Preface

This is the project notebook for the Invasion Heterogeneity project, which uses Jennifer Williams’ Arabidopsis experiments to motivate explorations of the effects of heterogeneity, stochasticity, and evolution on the variation in spread rate.

Usage

This notebook is prepared in bookdown. I ran into a number of poorly documented issues.

  • To use the build tab, you need to go into the Rproj file and set build type to “Website.” In addition, you need to make sure that the lines knit: "bookdown::render_book", site: "bookdown::bookdown_site", and output: bookdown::gitbook are in the yaml header (either in index.Rmd or, presumably, _bookdown.yml)
  • I’m trying to suppress the chapter/section numbers, and only show top-level stuff in the toc. But that doesn’t seem to be working.

Individual content files should be named with the date, in the format yyyy-mm-dd.Rmd. The first line should also include the date, e.g., # (PART) 7 April 2017 {-}. This creates a “part” with the date, and all subjects within the day’s work can be top-level headers.

(PART) 7 April 2017

Conversation with Jennifer Williams

Had a Skype chat today with Jenn, to get back on track with this project.

Ler populations

We recalled the work I did on the Ler populations in July 2015, which showed that:

  • Variance among pots in seed production did not scale with \(1/N\), suggesting something more than simple demographic stochasticity
  • Variance in per-generation spread seemed to be iid—no consistent replicate effects or autocorrelation (but I can’t remember whether I explicitly checked the latter)
  • Among-pot heterogeneity in dispersal was required to reproduce the variance in per-generation spread
  • I had to write a rather complex model!

Evolution experiment

Jenn reminded me that the variation in spread was higher in the control than in the evolution treatments for three of the four landscapes (there’s a figure in the appendix of the Science paper). However, that comparison was based on the CV (and that might have just been evaluated at the end of the experiment; I don’t recall whether we looked at each generation’s spread); it might be that the patterns in total variance are different, at least in the fragmented landscape where there was a big difference in mean spread rates.

Predictions and hypotheses

I hypothesized that among-individual heterogeneity in the dispersal kernel, like variation in growth rate in the IPM, generates population-level effects that don’t disappear as the population size gets large. This comes from Joe’s work in deterministic settings; the question, then, is: in a stochastic setting, how do the mean and variance of spread depend on individual heterogeneity?

I also wondered how iid variation in spread would translate into increasing variance among replicates over time. Is it like a random walk, with variance increasing linearly through time? This would be a “random ratchet.” It seems as though the strict non-negativity of the change shouldn’t affect the general result that the variance in total distance spread should just be the sum of the per-generation variances, so that inter-replicate variance would grow linearly with time.
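As a sanity check on that argument, here’s a quick simulation sketch (gamma increments are purely an illustrative choice of a strictly non-negative iid increment):

```r
set.seed(1)
n_rep <- 5000   # replicate invasions
n_gen <- 20     # generations
# iid, strictly non-negative per-generation spread increments
# (gamma with mean 2 and variance 2, chosen just for illustration)
inc <- matrix(rgamma(n_rep * n_gen, shape = 2, rate = 1), n_rep, n_gen)
cum_spread <- t(apply(inc, 1, cumsum))
v <- apply(cum_spread, 2, var)
# if increments are iid, var(cumulative spread) at generation t should be
# close to t * var(increment) = 2t; these ratios should all be near 1
v[c(5, 10, 20)] / (2 * c(5, 10, 20))
```

So at least in this toy version, the non-negativity of the increments doesn’t disturb the linear growth of inter-replicate variance.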

The result for the evolution experiment suggests that the het created by genetic variation is increasing stochasticity of some sort, as for iid het and Poisson variation.

Central question: What are the drivers of variation in spread rate, and are they predictable?

Data

We discussed the data. There are some potentially useful observations (such as plant height in the Ler experiment) that I’ve not seen before. Jenn gave me access to all of her data on Dropbox.

We also discussed temporal variation. Plant performance is probably best in spring and autumn (the greenhouse is too hot in summer), but there were also some pathogen/pest outbreaks that were not seasonal. To the extent that the latter affected the whole greenhouse, we can just use a generation fixed effect.

I wonder if we know the season of the dispersal experiments?

(PART) 10 April 2017

Computing setup

Bookdown

I’m creating this notebook in bookdown. I ran into a number of poorly documented issues.

  • To use the build tab, you need to go into the Rproj file and set build type to “Website.” In addition, you need to make sure that the lines knit: "bookdown::render_book", site: "bookdown::bookdown_site", and output: bookdown::gitbook are in the yaml header (either in index.Rmd or, presumably, _bookdown.yml)
  • I’m trying to suppress the chapter/section numbers, and only show top-level stuff in the toc. There’s a proposed solution here but that doesn’t seem to be working (I don’t know where to put the json file referenced there).
  • I’d hoped not to have subsections display in the sidebar. Sometimes the subsections only show in the currently selected chapter, but that seems not to be consistent behavior.
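For my own reference, a minimal index.Rmd YAML header along these lines would look something like this (the title line is a placeholder):

```yaml
---
title: "Invasion Heterogeneity notebook"
knit: "bookdown::render_book"
site: "bookdown::bookdown_site"
output: bookdown::gitbook
---
```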

Individual content files should be named with the date, in the format yyyy-mm-dd.Rmd. The first line should also include the date, e.g., # (PART) 7 April 2017 {-}. This creates a “part” with the date, and all subjects within the day’s work can be top-level headers.

As the notebook gets longer I’ll want to start making good use of preview_chapter(), but it does not (by default) display the page. I’ll need to figure out how to get that working…

I’ll also want to set up a template file, especially once I decide what needs to go into the preamble for R code execution.

I need to look through the bookdown documentation to see what else I can set in _bookdown.yml. A good place to start is the Configuration chapter of Authoring Books with R Markdown. It may be that I can put the index.Rmd file in the project root, and the entries in a subdirectory specified by rmd_subdir, and have the working directory for the Rmd files be the project root.

Just for grins, let’s see what the working directory is set to:

getwd()
## [1] "/Users/Bkendall/Documents/github-bitbucket/InvasionHet/notebook"
  • With a “standard” setup (the index.Rmd being in notebook) it gives the notebook directory.
  • After moving index.Rmd and _bookdown.yml to the project root, and adding rmd_subdir: true to the latter, I get the project root as the working directory. This is good! However, it appears that this will lead to Rmd files in all subdirectories getting included. Unless I can use regular expressions in the rmd_files option this will eventually be a dealbreaker; I haven’t found any way to do that.

Let me try moving everything back to the notebook directory and applying setwd():

setwd("..")
getwd()
## [1] "/Users/Bkendall/Documents/github-bitbucket/InvasionHet"

This seems to work.

I won’t want R code from old entries to re-run, not just for time but because data and functions may have changed. I think the key here is to:

  1. Use the K-M (“Knit and Merge”) approach by putting new_session: yes in _bookdown.yml
  2. Set cache on for everything (i.e., in the knitr options set in the index file).

But this is incompatible with having Rmd files in subdirectories!!!!

In principle, it may save compilation time to have files without R code simply be md files. But it seems that only Rmd files are searched by default; md files would need to be added by hand to rmd_files.
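Putting those pieces together, the _bookdown.yml I’m describing might look like this sketch (file names are illustrative, following the naming convention above):

```yaml
book_filename: "_main"
new_session: yes                 # the K-M approach: each Rmd knits in its own session
rmd_files: ["index.Rmd",         # listed by hand, since plain md files (and
            "2017-04-07.Rmd",    #   regex patterns) aren't picked up automatically
            "2017-04-10.Rmd"]
```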

ProjectTemplate

I also set up the directory structure using ProjectTemplate. The one tricky thing is that Rmd files in this notebook (and in the reports directory as well) may have to chdir to the root, which might be tricky. If I keep the index file in the root as described above then this shouldn’t be a problem.

(PART) 11 April 2017

Getting data in via ProjectTemplate

First, here is a draft of the R header chunk that I will want to include in every post that uses R (silently):

knitr::opts_chunk$set(comment = '', cache = TRUE)
knitr::opts_knit$set(root.dir = "..")

I’m sure I’ll come up with some more.

Now, on to how to load the Arabidopsis data using ProjectTemplate. I started with the Ler population data, which is in multiple xlsx files, with one sheet having data, one having metadata, and potentially some blank sheets as well. If I follow the standard protocol for ProjectTemplate and drop them into the data directory, then when I execute load.project() I get a data object for every sheet of every file (because there is no way to pass options to read.xlsx()). In addition, once they are loaded as data objects, it is hard to access them in a loop to build the combined datasets. Finally, the workspace is cluttered with all of these objects; if I delete them as part of the data munging, then they will get reloaded the next time I use load.project(), which is time consuming!

So instead I took the data files out of the project directory, and used the option to have a .R file in the data directory. Using this, I can re-use my existing code to read the files in from the Dropbox directory and manipulate them as I want, deleting the intermediate objects and just keeping the final objects. Furthermore, if I then cache the objects, then subsequent loads are fast!

So here’s the file:

cat(readLines('data/popLer.R'), sep = '\n')
### Creates the data objects popLer and popLer_cm, representing the Ler populations
###     experiments

# Raw data consists of one file per generation
# Final column is named inconsistently, so needs to be corrected before merge

data_dir <- "~/Documents/Dropbox/Arabidopsis/Data/Exp1"
data_fname <- list.files(data_dir, "seedposition")

popLer_cm <- NULL
for (i in 1:length(data_fname)) {
  tempdata <- xlsx::read.xlsx(file.path(data_dir, data_fname[i]), sheetName = "Data", 
                              header = TRUE)
  names(tempdata)[9] <- "seedlings"
  popLer_cm <- rbind(popLer_cm, tempdata)
  rm(tempdata)
}

# Clean up column names and get a useful order
popLer_cm <- popLer_cm[-1]
names(popLer_cm) <- c("ID", "Treatment", "Rep", "Gap", "Generation", "Pot", "Distance",
                      "Seedlings")
ord <- with(popLer_cm, order(Treatment, Gap, Rep, Generation, Pot, Distance))
popLer_cm <- popLer_cm[ord,]

# Make a version that just has pot totals
require(plyr)
popLer <- ddply(popLer_cm, .(ID, Gap, Rep, Treatment, Generation, Pot), summarize,
                Seedlings = sum(Seedlings))
ord <- with(popLer, order(Treatment, Gap, Rep, Generation, Pot))
popLer <- popLer[ord,]

Now let’s look at what we get:

ProjectTemplate::load.project()
Loading project configuration
Autoloading helper functions
 Running helper script: helpers.R
Autoloading cache
 Loading cached data set: popLer_cm
 Loading cached data set: popLer
Autoloading data
Munging data
 Running preprocessing script: 01-A.R
ls()
[1] "config"          "helper.function" "popLer"          "popLer_cm"      
[5] "project.info"   
head(popLer)
    ID Gap Rep Treatment Generation Pot Seedlings
359 21  0p   1         A          1   0       132
360 21  0p   1         A          1   1        21
361 21  0p   1         A          2   0        74
362 21  0p   1         A          2   1        72
363 21  0p   1         A          2   2        26
364 21  0p   1         A          2   3         2
head(popLer_cm)
   ID Treatment Rep Gap Generation Pot Distance Seedlings
40 21         A   1  0p          1   0        4       132
42 21         A   1  0p          1   1        9         5
43 21         A   1  0p          1   1       10         6
44 21         A   1  0p          1   1       11         3
45 21         A   1  0p          1   1       13         4
41 21         A   1  0p          1   1       14         3

Looking good!

This has the added advantage that any changes Jenn makes to the data can be easily loaded by deleting the cache; and that I can share the project in a public github repository (and Jenn can use it as long as we have our Dropbox in the same place).

14 April 2017

Project setup

I have linked the project to a github repository. I’ve excluded from the repository the .Rdata files in cache (which suggests that I should put the cache() commands in the file generating the data, otherwise I need to remember to do them by hand on a new computer) and the notebook/_book directory, which has files created when I build the notebook. I did retain notebook/_bookdown_files, as that is where the caches for the notebook pages are stored. This means that I am saving binary objects (e.g., .Rdata files), but they shouldn’t get updated unless I modify code blocks in past journal entries.

The file notebook/_main.rds also needs to be ignored, as it is regenerated by bookdown each time. Unfortunately it has already been checked in; I need to figure out how to remove it. A quick google reveals that running git rm --cached notebook/_main.rds from the command line does the trick—and I’ve done it.

I’m now experimenting with a new format for the notebook where the “chapter” is the date and the day’s entries are second-level heads. Let’s see how that goes.

I’ve just checked on Jenn’s repository, and it looks like she is using Dropbox directly under her home directory (i.e., file paths read ~/Dropbox). Right now my Dropbox folder is inside ~/Documents (which I think might be the default setup for mac). I have just moved my Dropbox folder up a level (using preferences in the Dropbox app)—I hope that doesn’t break anything else!

Ler population data

I’ve updated popLer.R to:

  • Point to the new Dropbox data directory
  • Automatically cache the files
  • Remove temporary objects that had been generated by the script—otherwise they clutter up the workspace

By deleting the cached .Rdata files and re-running load.project() I confirmed that this all works as expected.

There are two additional datasets from the Ler population, which I have not previously utilized. One is a count of siliques in each pot from treatments A and B, in each generation. The count from treatment B will allow me to look at variation in silique production in isolated plants, and relate siliques to effective seed number.

The other is measurements of plant heights in all pots in generation 5. There are two measurements for each pot: the tallest individual and the height of “the crowd”, which is an effort to estimate typical height in the pot. For treatment B, I can relate height to silique production. I might also be able to use treatment B to look for additional effects of height on dispersal, by simultaneously modeling dispersal from all parent plants and incorporating a height term in the dispersal parameters. In treatment C, I can look for a relationship between the two height measures and density.

As a reminder, here are the treatments:

  • Treatment C is the unmanipulated population, with all seedlings allowed to grow. It was initialized with 50 seeds in generation 0.
  • Treatment B is thinned to one plant per pot at the seedling stage. In “mother pots,” the seedling closest to the center is chosen; in newly colonized pots, the furthest advanced seedling is chosen.
  • Treatment A is thinned just before dispersal, with the goal of leaving the same total number of siliques as one of the Treatment B pots. It was initialized with 50 seeds in generation 0.

As far as I understand it, the seedling counts are before any thinning occurs.

This is documented in ~/Dropbox/Arabidopsis/protocols/Exp1.

I have created two data objects. popLer_cm records position along the runway to the nearest centimeter for newly colonized pots (it looks like in gen 2 it was recorded for all pots but that did not recur). popLer simply records number by pot.

As far as I can tell/recall, there was no separate dispersal experiment for Ler. So the dispersal kernels come from the first generation of continuous runways.

Ler fecundity

Let’s look again at Ler seed production, using generation one data.

First, extract those data for the relevant treatments:

Ler1BC <- subset(popLer, Gap=="0p" & Generation == 1 & Treatment != "A")
Ler1BC$Density <- 1
Ler1BC$Density[Ler1BC$Treatment == "C"] <- 50

Next step: figure out how to use dplyr to get the effective seed number.

17 April 2017

Project stuff

I turned on library loading, but for now just have dplyr and ggplot2

Ler fecundity

(Picking up from Friday)

Let’s look again at Ler seed production, using generation one data.

First, extract those data for the relevant treatments:

Ler1BC <- subset(popLer, Gap=="0p" & Generation == 1 & Treatment != "A")
Ler1BC$Density <- 1
Ler1BC$Density[Ler1BC$Treatment == "C"] <- 50

WARNING

The code that follows calculates effective seed number incorrectly. See the correct version of the analysis in the entry for April 21. (21-04-17)


We calculate the “effective seed number” as the number falling in the mother pot plus twice the number falling in the other pots (the latter gets at bidirectional dispersal, and estimates the number of seeds that would fall somewhere on the runway if the home pot was in the middle of the runway):

seedlings1 <- group_by(Ler1BC, ID, Density) %>% 
  summarise(eff_sd_no = sum(Seedlings) + sum(Seedlings * Pot > 0))

Plot the distributions:

ggplot(seedlings1, aes(x = eff_sd_no, fill = as.factor(Density))) + 
  geom_histogram(binwidth = 50)

(of course, stacked histograms aren’t great, but it gives the general idea).

Now look at the mean and variance across pots:

kable(group_by(seedlings1, Density) %>% 
        summarise(Mean = mean(eff_sd_no), Variance = var(eff_sd_no)),
      caption = paste("Mean and variance across pots of effective seed number",
                      "in treatments B and C of Ler generation 1")
      )
Mean and variance across pots of effective seed number in treatments B and C of Ler generation 1
Density    Mean    Variance
      1   108.4    1020.267
     50   261.4   10440.489

We can see a number of things from this:

  • For isolated plants, the variance is about 10 times the mean. Compared with Poisson, this is huge!
  • As we go from a density of 1 to 50, the seeds per plant drops by a factor of 20, from about 100 to about 5.
  • If the among-pot variance at high density was caused by iid variation in individual seed production, the among individual variance would need to be 10440/50, or about 200.
    • This is a 4-fold increase in the variance:mean ratio
    • To achieve it, we would need, on average, one individual producing the average isolated-plant seed number (108) and the other 49 producing zero. While I am sure that some plants produce zero seeds (including by dying), it seems unlikely that the hierarchy is this strong.
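The arithmetic in that extreme scenario can be checked directly: if each of 50 plants independently produces 108 seeds with probability 1/50 and zero otherwise (a sketch of the claim, not a fitted model), the implied variances are:

```r
p <- 1/50        # chance that a given plant is the one that reproduces
mu_iso <- 108    # average isolated-plant seed number
v_ind <- mu_iso^2 * p * (1 - p)  # among-individual variance (scaled Bernoulli)
v_pot <- 50 * v_ind              # variance of the pot total (sum of 50 iid plants)
c(v_ind = v_ind, v_pot = v_pot)  # roughly 229 and 11400
```

So the one-winner hierarchy does deliver an among-individual variance near the required ~200, and a pot-level variance in the neighborhood of the observed ~10440.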

It would be great to have more info on silique number and on height. However, we only have silique number for treatment B (solitary plants); the measurement for treatment A seems to be after thinning. And height is only available in generation 5, at which point it’s difficult to get unambiguous seed production/dispersal estimates (we can, however, look at non-dispersed seed production in the isolated pots).

Some additional analyses to do:

  • Look for a correlation between silique number and height in generation 5, treatment B. If greater fecundity is associated with greater height, then we’d expect more fecund plants to have better dispersal, which we can test in generation 0. Of course, we don’t have any way to extend this to the population.
  • Look at the correlation between silique number and effective seed number in G1B0p.
  • Look at whether silique number predicts the fraction staying in the home pot, and more generally how variable that fraction is. Again, though, I’m not sure how that will generalize. It just provides hints at mechanism.

Other thoughts:

  • To the extent that home-pot seeds is a useful proxy for total seeds, I can look at the isolated pots to see how fecundity varies with density and with generation (the latter to pick up seasonality and pests)
  • I’ve had other thoughts but my brain is full

21 April 2017

Data munging

Realizing that I need to re-use the seedlings1 data from last time, I see that I should start to use the munge directory for key derived data. I therefore need to give it a more meaningful name; how about Ler_seed_gen1. I’ll also add some more information about home and away seeds.

The code for doing this is in munge/01-Ler-seed-gen1.R

Ler fecundity

Reanalyze effective seed number

In the process of doing this I discovered that last time I failed to double the dispersing seeds in the effective seed number calculation. So let’s redo the graph and table.

Plot the distributions:

ggplot(Ler_seed_gen1, aes(x = eff_sd_no, fill = as.factor(Density))) + 
  geom_histogram(binwidth = 50)

(of course, stacked histograms aren’t great, but it gives the general idea).

Now look at the mean and variance across pots:

kable(group_by(Ler_seed_gen1, Density) %>% 
        summarise(Mean = mean(eff_sd_no), Variance = var(eff_sd_no)),
      caption = paste("Mean and variance across pots of effective seed number",
                      "in treatments B and C of Ler generation 1")
      )
Mean and variance across pots of effective seed number in treatments B and C of Ler generation 1
Density    Mean    Variance
      1   133.9    3443.433
     50   347.6   21362.933

This changes the quantitative variance:mean ratios:

  • For isolated plants, the variance is about 26 times the mean. Compared with Poisson, this is huge!
  • As we go from a density of 1 to 50, the seeds per plant drops by a factor of 19, from 134 to about 7.
  • If the among-pot variance at high density was caused by iid variation in individual seed production, the among individual variance would need to be 21363/50, or about 427.
    • This is a 2-fold increase in the variance:mean ratio
    • It seems that, to achieve this among-individual variance, almost all the seeds in the high density pots would need to come from one or two individuals; whether the among-successful-individual variance required is consistent with what we see in isolated pots remains an open question. We would need to assume something like a lognormal distribution among individuals to get at this.
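As a first pass at that, here’s a sketch matching a lognormal to the implied per-plant moments (mean and required variance taken from the numbers above):

```r
m <- 347.6 / 50   # per-plant mean at density 50 (about 7 seeds)
v <- 21363 / 50   # required among-individual variance (about 427)
cv2 <- v / m^2                 # squared coefficient of variation
sigma2 <- log(1 + cv2)         # moment-matched lognormal parameters
mu <- log(m) - sigma2 / 2
c(mu = mu, sigma = sqrt(sigma2))  # sigma comes out around 1.5
```

A log-scale SD around 1.5 is an enormous hierarchy: the 97.5th-percentile individual would produce a few hundred times as many seeds as the 2.5th-percentile one. That at least puts a number on how skewed the individual distribution would have to be.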

Home seeds as proxy

Now lets look at how well home seeds predicts total effective seeds:

qplot(home, eff_sd_no, data = Ler_seed_gen1, 
      group = as.factor(Density), colour = as.factor(Density),
      xlab = "Number of seedlings in home pot", 
      ylab = "Effective seed number"
      ) + 
  geom_smooth(method = "lm")

summary(lm(eff_sd_no ~ home * as.factor(Density), data = Ler_seed_gen1))

Call:
lm(formula = eff_sd_no ~ home * as.factor(Density), data = Ler_seed_gen1)

Residuals:
    Min      1Q  Median      3Q     Max 
-140.99  -40.94   -7.44   24.48  138.17 

Coefficients:
                          Estimate Std. Error t value Pr(>|t|)
(Intercept)                82.2439    80.5487   1.021    0.322
home                        0.5513     0.8223   0.670    0.512
as.factor(Density)50      -37.4497   104.2092  -0.359    0.724
home:as.factor(Density)50   0.8532     0.8708   0.980    0.342

Residual standard error: 74.29 on 16 degrees of freedom
Multiple R-squared:  0.8045,    Adjusted R-squared:  0.7678 
F-statistic: 21.94 on 3 and 16 DF,  p-value: 6.482e-06

There’s no evidence for differences between the two density treatments, so fortunately we can merge them:

qplot(home, eff_sd_no, data = Ler_seed_gen1, 
      xlab = "Number of seedlings in home pot", 
      ylab = "Effective seed number"
      ) + 
  geom_smooth(method = "lm")

summary(lm(eff_sd_no ~ home, data = Ler_seed_gen1))

Call:
lm(formula = eff_sd_no ~ home, data = Ler_seed_gen1)

Residuals:
     Min       1Q   Median       3Q      Max 
-139.659  -44.354   -4.624   31.966  147.853 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.9745    34.2927   0.116    0.909    
home          1.5310     0.1935   7.911 2.87e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 74.86 on 18 degrees of freedom
Multiple R-squared:  0.7766,    Adjusted R-squared:  0.7642 
F-statistic: 62.58 on 1 and 18 DF,  p-value: 2.873e-07

Fortunately the relationship is linear and independent of density (although with only two density treatments these conclusions are not super robust). And while we explain 78% of the variance there is still some scatter: the home pot seed count represents “measurement error” relative to the effective seed number (although the effective seed number itself seems pretty stochastic, as the fraction of seeds dispersing is pretty stochastic). However, it’s the best we have.

24 April 2017

DD Ler fecundity

We use the 3-pot gaps, looking at seeds in the home pot. The rationale is that a trivial number of seeds will be arriving from other pots.

Ler3pBC <- filter(popLer, Gap == "3p", Treatment != "A")
Ler3pBC <- group_by(Ler3pBC, Treatment, Pot, Rep) %>%
  mutate(Adults = lag(Seedlings)) 
Ler3pBC <- within(Ler3pBC,
                  {
                    Adults[Treatment == "B" & !is.na(Adults)] <- 1
                    Adults[Pot == 0 & Generation == 1] <- 1
                    Adults[Treatment == "C" & Pot == 0 & Generation == 1] <- 50
                  })

I couldn’t figure out how to do it all in dplyr!
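For future reference, I suspect case_when() would do it in a single mutate(). Here’s a sketch on a toy data frame (the values are made up; with first-match-wins semantics the conditions go in the opposite order from the within() version):

```r
library(dplyr)

# toy rows standing in for Ler3pBC (hypothetical values)
toy <- data.frame(Treatment  = c("C", "B", "B", "C"),
                  Pot        = c(0, 0, 1, 1),
                  Generation = c(1, 1, 2, 2),
                  Adults     = c(NA, NA, NA, 3))

toy <- mutate(toy, Adults = case_when(
  Treatment == "C" & Pot == 0 & Generation == 1 ~ 50,  # founding C pots
  Pot == 0 & Generation == 1                    ~ 1,   # other founding pots
  Treatment == "B" & !is.na(Adults)             ~ 1,   # thinned to one plant
  TRUE                                          ~ as.numeric(Adults)
))
toy$Adults
# [1] 50  1 NA  3
```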

I think there’s something weird going on in generation 6; let’s look at treatment B:

filter(Ler3pBC, Treatment == "B") %>%
  ggplot(aes(x = as.factor(Generation), y = Seedlings)) + geom_boxplot(notch = TRUE) 
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.

Actually, it’s the first and last generations that are odd: Gen 1 is unusually productive, and Gen 6 is unusually underproductive.

Let’s look at the density dependence:

qplot(data=Ler3pBC, x = Adults, y = Seedlings/Adults, 
      colour = as.factor(Generation), log = "xy") + 
  geom_smooth(method = "lm")
Warning: Removed 15 rows containing non-finite values (stat_smooth).
Warning: Removed 15 rows containing missing values (geom_point).

Generation 6 looks really different.

It’s a bit less pronounced if we leave out treatment B, but it’s still there:

qplot(data=filter(Ler3pBC, Treatment == "C"), x = Adults, y = Seedlings/Adults, 
      colour = as.factor(Generation), log = "xy") + 
  geom_smooth(method = "lm")
Warning: Removed 9 rows containing non-finite values (stat_smooth).
Warning: Removed 9 rows containing missing values (geom_point).

So I think we want to leave out gen 6.

Note that there looks to be some curvature in the log-log plot, suggesting that the log transform of Adults is too strong. I need to estimate a power transform instead, or fit the Wilkinson model.

26 April 2017

Calculate Ler spread rates

First we look at furthest distance reached.

Ler_furthest <- group_by(popLer, Treatment, Gap, Rep, Generation) %>%
  summarise(Furthest = max(Pot))
ggplot(Ler_furthest, aes(x = Generation, y = Furthest, color = as.factor(Rep))) +
  geom_line() + facet_grid(Treatment ~ Gap)

Hmm, I’ve uncovered that popLer has a line of all NA’s (and popLer_cm has three)! I’ve gone ahead and updated popLer.R (and added plyr to the set of autoloaded packages) to fix this. Now I don’t have extra rows/columns in the facet table!

Now let’s look at a threshold:

threshold <- 100
Ler_thresh <- filter(popLer, Seedlings > threshold) %>%
  group_by(Treatment, Gap, Rep, Generation) %>%
  summarise(Furthest = max(Pot))
ggplot(Ler_thresh, aes(x = Generation, y = Furthest, color = as.factor(Rep))) +
  geom_line() + facet_grid(Treatment ~ Gap)

Not surprisingly, this works best for Treatment C.

mean(filter(Ler_furthest, Treatment == "C", Gap == "0p", Generation == 6)$Furthest)
[1] 14
var(filter(Ler_furthest, Treatment == "C", Gap == "0p", Generation == 6)$Furthest)
[1] 14.44444
mean(filter(Ler_thresh, Treatment == "C", Gap == "0p", Generation == 6)$Furthest)
[1] 12.3
var(filter(Ler_thresh, Treatment == "C", Gap == "0p", Generation == 6)$Furthest)
[1] 11.78889

The mean and variance of total distance travelled are slightly less in the threshold case, but that might all be attributable to generation 1 (effectively, there is a one-generation delay between initial colonization and reaching the threshold, here set to 100 seedlings).

HMMPF. Now I’m getting an error from load.project() that seems to derive from 01-Ler-seed-gen1.R — except that the script works fine when I run it in the console!

4/28: Solved this—it had to do with loading both plyr and dplyr in the initialization file. I’ve taken out plyr, and explicitly used plyr::ddply() in the script that generates popLer.

1 May 2017

Sanity check

I’ve been farting around with data for a few weeks now. I think it has been useful, both in terms of getting me back into the data and code, and figuring out how to work in the bookdown/ProjectTemplate world, but I can easily get lost in here indefinitely.

So now it’s time to take stock. I also have to think about the talk I’m going to give at Bren in 2 (!) weeks.

So recall the central question: What are the drivers of variation in spread rate, and are they predictable?

There are two components:

  1. Variation and/or heterogeneity among pots in per-capita seed production
  2. Variation and/or heterogeneity among pots in dispersal

What we know from the analyses so far

  • In the single-plant pots, average home seed production is consistent across generations 2-5, but is high in gen 1 and low in gen 6.
  • In the single-plant pots, the variance in home seed production seems lowest in gens 1, 3, and 6, and highest in gen 2 (this is based on looking at the IQR from the boxplots; I haven’t actually calculated the variances yet).
  • In the population runs, the density-dependent pattern of seed production looks, in expectation, the same across all generations except 6, which runs low. I can’t actually measure the density-dependent curve in gen 1, but the observations at \(N=50\) are consistent with the curves from gens 2-5. I have not looked at whether the residual variance differs among generations.

Some remaining empirical questions

  1. The model for effective seed number vs home seed number is eff_sds = 4 + 1.5*home_sds, which is not quite a proportionality. What do we find if:
    1. We force the intercept to zero?
    2. We do a power law regression to see if the exponent is one?
  2. What is the among-generation pattern in residual seed production variance?
  3. What is the relationship between silique number and seed number?
  4. What is the relationship between the fraction dispersing and silique/seed number?
  5. What is the relationship between the fraction dispersing and the dispersal statistics (mean, variance, extreme of distances) of the dispersers?
  6. What is the relationship between height and silique/seed number?
  7. What is the relationship between height and dispersal? This will require some sort of model fitting, as we only have height in a late generation.
  8. Can we get useful estimates of size asymmetries by comparing tallest individual and “height of crowd”? If so, does this depend on density?
  9. Can we get a more general view of size hierarchies and the relationship between height and silique production from the RIL density experiments? Can these be easily translated to Ler?
  10. Is there any autocorrelation or temporal trend in spread rate?
  11. How does the variance in cumulative spread increase through time?
  12. Does spread depend on abundance/seed production/siliques in leading pot? In leading several pots?

I need to distinguish between the empirical patterns that I need to build into the model vs. those that I want the model to predict. The primary one of the latter is the variance in cumulative spread.

I also need to keep in mind that the goal is not necessarily to generate proximal explanations for all of the sources of variability, but to show how each of the sources of variability contributes to the variability in spread. In general, the types of variability are:

  • Among-individual variation in seed production, which can be divided into iid variation (“demographic stochasticity”) and heterogeneity
  • Among-pot variation in seed production, between reps, between pots within a rep, and between generations (“environmental stochasticity”)
  • Sampling variation in the fraction dispersing
  • Among-pot variation in the fraction dispersing (is there any way to get at this outside of the first generation?)
  • Sampling variation in the realized dispersal kernel
  • Among-pot variation in the estimated dispersal kernel
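To make the first two variance types concrete, here is a sketch of a method-of-moments decomposition of per-plant seed production into among-pot heterogeneity and residual iid variation, via a one-way ANOVA. All names and parameter values are made up for illustration; this runs on simulated data, not the experiment.

```r
# Sketch on simulated data (not the experiment): decompose per-plant seed
# production into among-pot heterogeneity and residual iid variation.
set.seed(3)
n_pots <- 20; n_plants <- 15                   # plants per pot (illustrative)
pot_effect <- rnorm(n_pots, sd = 2)            # true among-pot sd = 2
seeds <- unlist(lapply(pot_effect,
                       function(b) 10 + b + rnorm(n_plants, sd = 3)))
pot <- factor(rep(seq_len(n_pots), each = n_plants))
ms <- summary(aov(seeds ~ pot))[[1]][["Mean Sq"]]
within_var <- ms[2]                            # residual (iid) variance
among_var  <- (ms[1] - ms[2]) / n_plants       # among-pot variance component
c(among = among_var, within = within_var)      # true simulated values: 4 and 9
```

The same decomposition would extend to the reps-within-generations structure by adding strata, but the two-level version shows the logic.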

Some theory

It’s straightforward to show that, if the spread increment is iid, then the theoretical variance in cumulative spread grows linearly with time. Has this been noted in the literature? In finite realizations, there will be autocorrelation. A replicate that is ahead of the pack this year will tend to be ahead of the pack next year; and a collection of replicates that has higher than expected variance this year will tend to have higher than expected variance next year.
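To spell out the claim (standard iid algebra, with generic symbols): if the per-generation spread increments \(X_1, \dots, X_t\) are iid with mean \(\mu\) and variance \(\sigma^2\), then the cumulative spread \(S_t = \sum_{i=1}^{t} X_i\) satisfies

\[
\operatorname{E}(S_t) = t\mu, \qquad \operatorname{Var}(S_t) = \sum_{i=1}^{t}\operatorname{Var}(X_i) = t\sigma^2,
\]

since all covariance terms vanish under independence.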

Two key questions are then:

  • To what extent are the growth increments iid?
    • Temporal positive (negative) autocorrelation would increase (decrease) the rate of increase in variance;
    • correlations among “replicates” (e.g., if they are at neighboring points on a front) also could be important.
    • Finally, the spread rate might not be stationary: Evolution could lead to trends in spread rate, and deviations from the asymptotic spatial structure would, in many cases, affect the expected spread rate.
  • How much of the variation in annual spread can be explained by current characteristics of the population, and do those characteristics themselves have predictable dynamics? (The latter is actually the feedback mechanism by which we might get autocorrelation.)
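As a reference point for the autocorrelation question (a standard time-series result, not something estimated from our data): if the increments follow a stationary AR(1) with lag-1 autocorrelation \(\rho\) and marginal variance \(\sigma^2\), then

\[
\operatorname{Var}(S_t) = \sigma^2\left[t + 2\sum_{k=1}^{t-1}(t-k)\rho^k\right] \approx t\,\sigma^2\,\frac{1+\rho}{1-\rho} \quad \text{for large } t,
\]

so the growth is still asymptotically linear, but positive \(\rho\) inflates, and negative \(\rho\) deflates, the slope.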

Again, the key questions for predicting spread are: How much of the variation in annual spread is “inherently stochastic”? How large is that variance relative to the mean rate of spread? The CV of cumulative spread will shrink with the square root of time, but it will be proportional to the CV of one-generation spread.
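A quick simulation check of both scaling claims, using arbitrary illustrative parameter values (not the Ler estimates):

```r
# Illustrative simulation: with iid per-generation increments,
# Var(cumulative spread) grows linearly in time and the CV of
# cumulative spread shrinks like 1/sqrt(t).
set.seed(1)
n_reps <- 5000
n_gen  <- 6
increments <- matrix(rnorm(n_reps * n_gen, mean = 2, sd = 1), n_reps, n_gen)
cum_spread <- t(apply(increments, 1, cumsum))    # n_reps x n_gen
v  <- apply(cum_spread, 2, var)
cv <- apply(cum_spread, 2, sd) / colMeans(cum_spread)
round(v / seq_len(n_gen), 2)         # ~ constant (sigma^2 = 1): linear growth
round(cv * sqrt(seq_len(n_gen)), 2)  # ~ constant (sigma/mu = 0.5): CV ~ 1/sqrt(t)
```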

Preparing for my seminar

I want to do the first section on variability in spread, and second section on evolution

For the first section:

  • Elaborate on the theory above. TO DO: look for prior literature on this. Start with Melbourne & Hastings
  • Describe the Arabidopsis experiments. TO DO: ask Jenn if she has more pictures
  • Show patterns of spread in Ler treatment C. TO DO: calculate statistics of annual and cumulative spread; make pictures of populations through time
  • Show extent to which Ler fits simple model. TO DO: Calculate var vs. time; work out sampling statistics to show plausibility
  • Show which processes contribute to the variance in per-generation spread. TO DO: re-run my analyses from July 2015; maybe apply them to fragmented landscapes

For the second section:

  • Motivate with cane toad example
  • Explain spatial sorting vs. natural selection. TO DO: Make sure I understand the spatial sorting model
  • Show the Science paper results. TO DO: ask Jenn for code so I can remake Fig. 2 in something that’s visible on slides
  • Show patterns of variance in annual/cumulative spread. TO DO: calculate spread statistics for evolution experiment
  • Compare/contrast with other spread evolution experiments. TO DO: read and understand the competing experiments
  • Discuss future work:
    • Do genetic diversity and evolution affect predictability and variance components?
    • Can we use models to explain patterns of repeatability in evolution?
    • Can we use models to show how fast the evolution happened?

Updates

  • The script for fig. 2 of the Science paper is in ~/Dropbox/Arabidopsis/analysls/Science_Scripts&Data.
  • I’ve just emailed Jenn to ask for photos/slides.

3 May 2017

Literature on theory on stochastic spread

Melbourne & Hastings (2009)

Melbourne & Hastings (2009) has this brief review of theory (p. 1537):

Stochastic models are needed to assess uncertainty in the rate of spread because traditional deterministic models provide no information about variance. Analyses have shown that linear stochastic models have the same rate of spread as the equivalent deterministic model, whereas stochasticity slows spread in nonlinear (density-dependent) models (9, 10). Investigations of the variance of spread rate have largely relied on numerical simulations and suggest that variance between realizations can be high, especially when birth rates are low, variation in birth rates between individuals is high, or dispersal includes infrequent long-distance events (11–15). This work suggests that great uncertainty in an invasion forecast can be expected because of inherent biological variability.

The rest of the models in the paper are simulation-based. Here are the references cited above:

  9. M. A. Lewis, J. Math. Biol. 41, 430 (2000).
  10. D. Mollison, Math. Biosci. 107, 255 (1991).
  11. J. S. Clark, M. Lewis, J. S. McLachlan, J. HilleRisLambers, Ecology 84, 1979 (2003).
  12. G. J. Gibson, C. A. Gilligan, A. Kleczkowski, Proc. R. Soc. London B Biol. Sci. 266, 1743 (1999).
  13. G. J. Gibson et al., Stat. Comput. 16, 391 (2006).
  14. M. A. Lewis, S. Pacala, J. Math. Biol. 41, 387 (2000).
  15. Analytic solutions for some special cases are provided in (14).

Skellam (1951)

Going back to Skellam (1951) (also cited by BM): it starts out looking promising, as it is based on “random dispersal”—but in fact he uses microscale randomness to derive the diffusion approximation, and the resulting description of spread is deterministic.

Lewis (2000)

Lewis (2000) studies an integro-difference framework: continuous space. His measure of spread is the “expectation velocity:”

This can be described as the speed of movement \(c\) of a point \(x_t\) beyond which the expected number of individuals \(n_e\) is fixed. Thus \(x_t\) is defined so that the integral of the expected density over the interval \((x_t,\infty)\) is \(n_e\). (p. 432)

He does note other studies have looked at furthest extent:

[Other definitions of spread rates are given by] The rate of change of the average location of the farthest forward individual with respect to time [6]. McKean [24] showed that, in certain linear stochastic processes, the distribution of the furthest forward individual can be modeled by a nonlinear deterministic reaction-diffusion equation (KPP or Fisher equation), which in turn exhibits traveling wave solutions.

The paper defies a quick read, but appears to be focused on the expected abundance at a location and the covariance in abundances between two locations (although the covariance might be in the locations of pairs of individuals). I don’t see a straightforward way to connect this to our problem.

  6. Biggins, J.D.: How fast does a general branching random walk spread? In: Athreya, K.B. and Jagers, P. (eds) Classical and Modern Branching Processes, pp 19–39. Springer, 1998

  24. McKean, H.P.: Application of Brownian motion to the equation of Kolmogorov-Petrovskii-Piscounov. Commun. Pure Appl. Math. 28, 323–331 (1975)

Mollison (1991)

This paper is primarily about showing how deterministic linear models can approximate the mean spread in stochastic and nonlinear models. The stochasticity is again at the micro scale: birth-death with Brownian movement of individuals.

Clark et al (2003)

This paper looks at long-distance dispersal (LDD). It is mostly about estimating parameters from data and describing the LDD model (although I think the model mostly comes from prior papers by Clark). They say repeatedly that spread in this case is “erratic” and “unpredictable” but don’t quantify it. In three numerical examples (Figs. 2c, 4) they plot medians, 50% and 90% intervals for cumulative distance spread through time. It’s not clear whether these are based on full population models or just draws from the distribution of jump distances. The intervals grow through time, but are smooth only in the case that lacks LDD (Fig. 4b), suggesting that they are not using enough draws to really characterize the distribution. It’s hard to tell how the width of the intervals scales with time.

Possible sources of the LDD model:

  • Clark, J. S., M. Lewis, and L. Horvath. 2001c. Invasion by extremes: population spread with variation in dispersal and reproduction. American Naturalist 157:537–554.
  • Clark, J. S., M. Silman, R. Kern, E. Macklin, and J. HilleRisLambers. 1999. Seed dispersal near and far: generalized patterns across temperate and tropical forests. Ecology 80:1475–1494.

Clark et al. (1999) introduces the “2Dt” dispersal kernel. Clark et al. (2001c) shows how LDD affects mean speed, and that variability in reproduction slows spread in the invasion-by-extremes model. They do not address the variance in spread, although it is illustrated in a numerical example (fig. 5).

Lewis & Pacala (2000)

This paper is a more extensive treatment of the models in Lewis (2000). Basically, the covariance is of interest because it describes the shape of the wavefront and characterizes the extent of patchiness. Variability in spread is not addressed, as far as I can see.

Stochastic models of population spread have been widely studied in the context of the spread of an infection. An introduction to the early work in this area can be found in a review by Mollison [26].

  26. Denis Mollison. Spatial contact models for ecological and epidemic spread. J. R. Statist. Soc. B, 39(3), 283–326 (1977)

Summary

I haven’t read all the papers in BM yet, and I haven’t done any other literature search, but so far I haven’t found any characterizations of the variability in cumulative spread. Mark Lewis has a new (2016) book that will probably contain anything that exists in the theory literature; I need to digest that next.

5 May 2017

Calculate Ler spread rates

First we look at furthest distance reached.

Ler_furthest_C <- filter(popLer, Treatment == "C") %>%
  group_by(Gap, Rep, Generation) %>%
  summarise(Furthest = max(Pot))

Now calculate per-generation spread rates

# The "default" argument to lag() replaces the leading NA with the specified value
gen_spread <- group_by(Ler_furthest_C, Gap, Rep) %>%
  mutate(speed =  Furthest - lag(Furthest, default = 0),
         speed_m1 = lag(speed))

Hmm, there is one case (rep 8, 3p gap) where the forward pot seems to go extinct, resulting in a negative speed. Need to check the original data, but for now let’s just set that to zero.

gen_spread <- within(gen_spread, speed[speed < 0] <- 0)
gen_spread <- within(gen_spread, speed_m1[speed_m1 < 0] <- 0)

Look at autocorrelation:

spread_AR <- group_by(gen_spread, Gap, Rep) %>% 
  summarise(AR1 = cor(speed, speed_m1, use = "complete"))
ggplot(spread_AR, aes(x = AR1)) + geom_histogram(aes(fill = Gap))
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

Although it looks like there is a bias towards negative values, I think that’s mostly from the 3p gaps, where the pattern 0–4–0 will be common.
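The negative bias expected from that gap-jumping pattern can be illustrated directly (hypothetical numbers, not the data):

```r
# Hypothetical illustration: in a 3-pot-gap landscape a replicate that
# alternates between jumping the gap (4 pots) and pausing (0 pots) has,
# by construction, a perfectly negative lag-1 correlation.
jumpy <- c(0, 4, 0, 4, 0, 4)
cor(jumpy[-1], jumpy[-length(jumpy)])  # -1: perfect alternation
```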

Look at trends by rep:

ggplot(gen_spread, aes(x = Generation, y = speed, color = as.factor(Rep))) +
  geom_point(position = "jitter") + 
  geom_smooth(method = "lm", se = FALSE) + 
  facet_wrap(~Gap)

ggplot(gen_spread, aes(x = Generation, y = speed)) +
  geom_point(position = "jitter") + 
  geom_smooth() + 
  facet_wrap(~Gap)

ggplot(gen_spread, aes(x = Generation, y = speed)) +
  geom_point(position = "jitter") + 
  geom_smooth(method = "gam", formula = y ~ s(x, k = 4)) + 
  facet_wrap(~Gap)

summary(lm(speed ~ Generation, data = filter(gen_spread, Gap == "0p")))

Call:
lm(formula = speed ~ Generation, data = filter(gen_spread, Gap == 
    "0p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.4705 -1.3090  0.0295  1.1224  3.5295 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.2933     0.4609   7.145 1.67e-09 ***
Generation   -0.2743     0.1183  -2.318    0.024 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.566 on 58 degrees of freedom
Multiple R-squared:  0.08476,   Adjusted R-squared:  0.06898 
F-statistic: 5.371 on 1 and 58 DF,  p-value: 0.02402
library(mgcv)
summary(gam(speed ~ s(Generation, k = 4), data = filter(gen_spread, Gap == "0p")))

Family: gaussian 
Link function: identity 

Formula:
speed ~ s(Generation, k = 4)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.3333     0.1828   12.77   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                edf Ref.df     F  p-value    
s(Generation) 2.223  2.597 7.506 0.000784 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.239   Deviance explained = 26.7%
GCV = 2.1184  Scale est. = 2.0046    n = 60
plot(gam(speed ~ s(Generation, k = 4), data = filter(gen_spread, Gap == "0p")))

anova(gam(speed ~ s(Generation, k = 4), data = filter(gen_spread, Gap == "0p")), 
      gam(speed ~ Generation, data = filter(gen_spread, Gap == "0p")),
      test = "Chisq") 
Analysis of Deviance Table

Model 1: speed ~ s(Generation, k = 4)
Model 2: speed ~ Generation
  Resid. Df Resid. Dev      Df Deviance  Pr(>Chi)    
1    56.403     113.82                               
2    58.000     142.17 -1.5975  -28.351 0.0004792 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

So there looks to be some nonstationarity in the continuous runways, with the speed peaking in generation 3. The initial increase makes sense, but the decline does not.

8 May 2017

Dispersal evolution in beetle experiments

Just read the papers from Brett’s and TEX’s labs. As Jenn said, their “no evolution” treatment was spatial shuffling, so it didn’t maintain the original genetic composition.

They both found increased mean and increased variance with evolution. Both found increased dispersal ability at the edge of the evolving populations; the flour beetle experiment also found a reduction in low-density growth rate.

The effects on dispersal are explained as spatial sorting; the growth rate effects are hypothesized to represent “gene surfing,” whereby deleterious alleles exposed by founder effects at the leading edge are carried forward because next generation’s leading edge is primarily founded by this generation’s leading edge.

The flour beetle experiment found that the among-replicate variance in cumulative spread grew faster than linearly. This is consistent with among-replicate differences in mean speeds (which would generate a quadratic growth in variance through time). This pattern was found in both treatments, suggesting that it’s not just a consequence of founder effects at the leading edge. Perhaps it represents founder effects in the initial establishment from the stock population (each replicate was founded by 20 individuals). The bean beetle experiment doesn’t show this graph but found strong support for a random effect for replicate speed in their model for cumulative spread.
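The quadratic signature of among-replicate speed differences is easy to write down (generic symbols, not fitted values): if replicate \(j\) has mean increment \(\mu + b_j\) with among-replicate variance \(\operatorname{Var}(b_j) = \sigma_b^2\), plus iid within-replicate noise with variance \(\sigma^2\), then the variance across replicates of cumulative spread is

\[
\operatorname{Var}(S_{j,t}) = t^2\sigma_b^2 + t\,\sigma^2,
\]

so persistent speed differences eventually dominate even when \(\sigma_b^2\) is small.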

So a couple of things to do (both for Ler and the evolution experiments):

  • Plot mean and variance of cumulative spread against time
  • Fit the bean beetle model (as a quick first approximation, I can treat rep as a fixed effect).

Ler cumulative spread

To facilitate the above analyses, I have modified the script for popLer to make categorical versions of Generation and Replicate (Gen and Rep, respectively), and put the guts of last time’s data manipulation into a munge script, which creates LerC_spread containing cumulative and per-generation spread for the C treatment replicates.

Let’s calculate the means and variances of cumulative spread:

cum_spread_stats <- group_by(LerC_spread, Gap, Gen, Generation) %>%
  summarise(mean_spread = mean(Furthest),
            var_spread = var(Furthest),
            CV_spread = sqrt(var_spread)/mean_spread
  )

And now plot the results:

ggplot(aes(x = Generation, y = mean_spread, color = Gap), data = cum_spread_stats) +
  geom_point() + geom_smooth(method = "lm")

ggplot(aes(x = Generation, y = var_spread, color = Gap), data = cum_spread_stats) +
  geom_point() + geom_line()

ggplot(aes(x = Generation, y = CV_spread, color = Gap), data = cum_spread_stats) +
  geom_point() + geom_line()
Warning: Removed 1 rows containing missing values (geom_point).
Warning: Removed 1 rows containing missing values (geom_path).

The linear approximation to mean spread is pretty good, although we see a dip in generation 6, as expected from the reduced seed production, and even without gen 6 the continuous runways seem to be decelerating. The ranking between landscapes makes sense. The variances are rather all over the map, although we don’t have confidence intervals on them. But except for 1-pot gaps, we can squint and imagine that the variances are linear in time. Note, however, that with only 6 generations we’re not going to easily detect nonlinearities. I don’t know that the CVs tell us much.

Probably the best way to get CIs is to bootstrap replicates.
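A sketch of that replicate-level bootstrap, run here on a simulated reps-by-generations matrix (in practice `LerC_spread` would be reshaped into this form; all values below are made up):

```r
# Sketch: bootstrap whole replicates to get CIs on the per-generation
# variance of cumulative spread. `furthest` is a hypothetical
# reps x generations matrix of furthest pot reached.
set.seed(2)
furthest <- t(apply(matrix(rpois(10 * 6, lambda = 2), 10, 6), 1, cumsum))
boot_var <- function(mat, n_boot = 1000) {
  replicate(n_boot, {
    idx <- sample(nrow(mat), replace = TRUE)   # resample whole replicates
    apply(mat[idx, , drop = FALSE], 2, var)    # variance per generation
  })
}
draws <- boot_var(furthest)                    # n_gen x n_boot
ci <- apply(draws, 1, quantile, probs = c(0.025, 0.975))
ci                                             # 2 x n_gen: CI per generation
```

Resampling whole replicates (rather than individual observations) preserves the within-replicate temporal dependence, which is the point here.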

Let’s try fitting a model. We’ll need to add a lag of cumulative spread.

LerC_spread <- group_by(LerC_spread, Gap, Rep) %>%
  mutate(Furthestm1 = lag(Furthest, default = 0))  # lag within each replicate
m1 <- lm(Furthest ~ Generation + Rep + Generation:Rep + Furthestm1, 
         data = filter(LerC_spread, Gap == "0p"))
summary(m1)

Call:
lm(formula = Furthest ~ Generation + Rep + Generation:Rep + Furthestm1, 
    data = filter(LerC_spread, Gap == "0p"))

Residuals:
     Min       1Q   Median       3Q      Max 
-2.08017 -0.73937  0.03383  0.78409  2.54087 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)      -0.26479    1.38823  -0.191 0.849720    
Generation        2.22470    0.58884   3.778 0.000528 ***
Rep2              0.42109    1.81056   0.233 0.817308    
Rep3              1.70697    1.81500   0.940 0.352762    
Rep4             -0.36309    1.82680  -0.199 0.843484    
Rep5              1.94739    1.80616   1.078 0.287571    
Rep6              0.91400    1.80930   0.505 0.616284    
Rep7              0.73685    1.80286   0.409 0.684987    
Rep8             -1.47533    1.86173  -0.792 0.432890    
Rep9              1.19824    1.80269   0.665 0.510156    
Rep10             0.63691    1.82680   0.349 0.729229    
Furthestm1        0.02636    0.21151   0.125 0.901446    
Generation:Rep2   0.15787    0.47548   0.332 0.741652    
Generation:Rep3  -0.81727    0.47167  -1.733 0.091045 .  
Generation:Rep4   1.08491    0.51942   2.089 0.043312 *  
Generation:Rep5   0.51499    0.51405   1.002 0.322606    
Generation:Rep6  -0.58945    0.47054  -1.253 0.217770    
Generation:Rep7  -0.25940    0.46323  -0.560 0.578691    
Generation:Rep8   1.45031    0.54307   2.671 0.010986 *  
Generation:Rep9  -0.20452    0.46429  -0.440 0.662009    
Generation:Rep10  0.65634    0.51942   1.264 0.213878    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.369 on 39 degrees of freedom
Multiple R-squared:  0.9503,    Adjusted R-squared:  0.9248 
F-statistic: 37.25 on 20 and 39 DF,  p-value: < 2.2e-16
car::Anova(m1)
Anova Table (Type II tests)

Response: Furthest
               Sum Sq Df F value  Pr(>F)  
Generation      5.178  1  2.7619 0.10455  
Rep            32.986  9  1.9550 0.07214 .
Furthestm1      0.029  1  0.0155 0.90145  
Generation:Rep 35.750  9  2.1188 0.05125 .
Residuals      73.114 39                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This is a little curious: I wasn’t expecting the reps to have different speeds, except as a side effect of autocorrelation. Let’s try a few other models:

m2 <- lm(Furthest ~ Generation + Rep + Generation:Rep, 
         data = filter(LerC_spread, Gap == "0p"))
m3 <- lm(Furthest ~ Generation + Furthestm1, data = filter(LerC_spread, Gap == "0p"))
anova(m1, m2, m3)
Analysis of Variance Table

Model 1: Furthest ~ Generation + Rep + Generation:Rep + Furthestm1
Model 2: Furthest ~ Generation + Rep + Generation:Rep
Model 3: Furthest ~ Generation + Furthestm1
  Res.Df     RSS  Df Sum of Sq      F  Pr(>F)  
1     39  73.114                               
2     40  73.143  -1    -0.029 0.0155 0.90145  
3     57 141.849 -17   -68.707 2.1558 0.02383 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Hmm, they really do seem to have different slopes. But we have not yet accounted for time-varying speeds. I don’t see how to do that in this context; let’s look at per-generation speeds instead:

m11 <- lm(speed ~ Gen + Rep, data = filter(LerC_spread, Gap == "0p"))
summary(m11)

Call:
lm(formula = speed ~ Gen + Rep, data = filter(LerC_spread, Gap == 
    "0p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-3.0333 -0.7083 -0.0500  0.8667  2.5333 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept)  1.733e+00  6.973e-01   2.486   0.0167 *
Gen2         1.400e+00  6.237e-01   2.245   0.0298 *
Gen3         1.300e+00  6.237e-01   2.084   0.0428 *
Gen4         7.000e-01  6.237e-01   1.122   0.2677  
Gen5         4.000e-01  6.237e-01   0.641   0.5246  
Gen6        -1.200e+00  6.237e-01  -1.924   0.0607 .
Rep2         7.525e-16  8.052e-01   0.000   1.0000  
Rep3        -6.667e-01  8.052e-01  -0.828   0.4121  
Rep4         1.000e+00  8.052e-01   1.242   0.2207  
Rep5         6.667e-01  8.052e-01   0.828   0.4121  
Rep6        -5.000e-01  8.052e-01  -0.621   0.5378  
Rep7        -3.333e-01  8.052e-01  -0.414   0.6809  
Rep8         1.167e+00  8.052e-01   1.449   0.1543  
Rep9        -1.667e-01  8.052e-01  -0.207   0.8370  
Rep10        5.000e-01  8.052e-01   0.621   0.5378  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.395 on 45 degrees of freedom
Multiple R-squared:  0.4365,    Adjusted R-squared:  0.2612 
F-statistic:  2.49 on 14 and 45 DF,  p-value: 0.01046
car::Anova(m11)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value   Pr(>F)   
Gen       46.133  5  4.7433 0.001451 **
Rep       21.667  9  1.2376 0.297021   
Residuals 87.533 45                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Well this is confusing! If the per-generation speeds don’t differ among reps, how can the slopes of cumulative spread do so? Let’s look at some plots:

ggplot(aes(x = Generation, y = Furthest, color = Rep), 
       data = filter(LerC_spread, Gap == "0p")) +
  geom_point() + geom_smooth(method = "lm")

ggplot(aes(x = Generation, y = speed, color = Rep), 
       data = filter(LerC_spread, Gap == "0p")) +
  geom_line()

Let’s look for autocorrelation in speed.

m12 <- lm(speed ~ Gen + Rep + speed_m1, data = filter(LerC_spread, Gap == "0p"))
summary(m12)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "0p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.9931 -0.5869 -0.0863  0.7257  2.5069 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.6162     0.7801   4.635  4.8e-05 ***
Gen3          0.3577     0.6601   0.542 0.591277    
Gen4         -0.2750     0.6549  -0.420 0.677148    
Gen5         -0.7711     0.6313  -1.222 0.230033    
Gen6         -2.4692     0.6246  -3.953 0.000357 ***
Rep2          0.2654     0.8793   0.302 0.764579    
Rep3         -0.7962     0.8839  -0.901 0.373882    
Rep4          1.5962     0.8839   1.806 0.079554 .  
Rep5          0.7270     0.8930   0.814 0.421127    
Rep6         -0.7308     0.8810  -0.829 0.412469    
Rep7         -0.2654     0.8793  -0.302 0.764579    
Rep8          1.7923     0.8992   1.993 0.054082 .  
Rep9         -0.4000     0.8787  -0.455 0.651778    
Rep10         0.8616     0.8879   0.970 0.338540    
speed_m1     -0.3270     0.1591  -2.055 0.047378 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.389 on 35 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.5314,    Adjusted R-squared:  0.3439 
F-statistic: 2.835 on 14 and 35 DF,  p-value: 0.006219
car::Anova(m12)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value    Pr(>F)    
Gen       47.871  4  6.1995 0.0007015 ***
Rep       32.146  9  1.8502 0.0935244 .  
speed_m1   8.154  1  4.2240 0.0473783 *  
Residuals 67.566 35                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Aha! There is negative autocorrelation in speed (evidently it was obscured by the time-dependent patterns in the univariate analysis I did last week). A potential explanation is partial “pushing” by older pots, so that a replicate that gets extra far this generation has less pushing next generation. Alternatively, it could reflect a correlation between distance spread and the number of plants in the furthest pot.

Let’s do one more analysis with a quadratic in time, so we can interact it with Rep:

m13 <- lm(speed ~ poly(Generation, 2) * Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "0p"))
summary(m13)

Call:
lm(formula = speed ~ poly(Generation, 2) * Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "0p"))

Residuals:
     Min       1Q   Median       3Q      Max 
-2.14668 -0.47051  0.02562  0.52439  2.46495 

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
(Intercept)                 3.054e+00  9.512e-01   3.211   0.0046 **
poly(Generation, 2)1        1.990e+00  7.837e+00   0.254   0.8023   
poly(Generation, 2)2       -2.395e+00  7.431e+00  -0.322   0.7507   
Rep2                        4.945e-01  1.154e+00   0.429   0.6730   
Rep3                        1.295e-01  1.159e+00   0.112   0.9122   
Rep4                        2.280e+00  1.154e+00   1.976   0.0629 . 
Rep5                        6.689e-01  1.166e+00   0.574   0.5730   
Rep6                       -6.265e-01  1.150e+00  -0.545   0.5922   
Rep7                        1.540e-02  1.160e+00   0.013   0.9895   
Rep8                        1.467e+00  1.155e+00   1.270   0.2194   
Rep9                       -9.598e-01  1.150e+00  -0.835   0.4142   
Rep10                       8.141e-01  1.148e+00   0.709   0.4869   
speed_m1                   -4.215e-01  1.982e-01  -2.126   0.0468 * 
poly(Generation, 2)1:Rep2  -1.014e+01  1.115e+01  -0.910   0.3745   
poly(Generation, 2)2:Rep2  -5.019e+00  1.087e+01  -0.462   0.6495   
poly(Generation, 2)1:Rep3  -1.910e+01  1.109e+01  -1.722   0.1012   
poly(Generation, 2)2:Rep3   7.551e+00  1.057e+01   0.714   0.4838   
poly(Generation, 2)1:Rep4  -1.356e+01  1.122e+01  -1.208   0.2417   
poly(Generation, 2)2:Rep4   3.330e+00  1.155e+01   0.288   0.7762   
poly(Generation, 2)1:Rep5  -3.324e+00  1.110e+01  -0.299   0.7678   
poly(Generation, 2)2:Rep5  -8.064e+00  1.046e+01  -0.771   0.4502   
poly(Generation, 2)1:Rep6  -3.761e+00  1.109e+01  -0.339   0.7382   
poly(Generation, 2)2:Rep6   5.375e-14  1.044e+01   0.000   1.0000   
poly(Generation, 2)1:Rep7  -1.072e+01  1.111e+01  -0.965   0.3467   
poly(Generation, 2)2:Rep7  -3.057e+00  1.080e+01  -0.283   0.7801   
poly(Generation, 2)1:Rep8   5.176e+00  1.121e+01   0.462   0.6496   
poly(Generation, 2)2:Rep8  -7.049e+00  1.053e+01  -0.669   0.5114   
poly(Generation, 2)1:Rep9   2.805e+00  1.108e+01   0.253   0.8028   
poly(Generation, 2)2:Rep9  -1.315e+01  1.057e+01  -1.244   0.2287   
poly(Generation, 2)1:Rep10 -6.363e+00  1.115e+01  -0.571   0.5750   
poly(Generation, 2)2:Rep10 -1.054e+01  1.087e+01  -0.970   0.3444   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.43 on 19 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.7304,    Adjusted R-squared:  0.3048 
F-statistic: 1.716 on 30 and 19 DF,  p-value: 0.1102
car::Anova(m13)
Anova Table (Type II tests)

Response: speed
                        Sum Sq Df F value    Pr(>F)    
poly(Generation, 2)     46.959  2 11.4785 0.0005389 ***
Rep                     32.487  9  1.7646 0.1422649    
speed_m1                 9.249  1  4.5217 0.0467938 *  
poly(Generation, 2):Rep 29.613 18  0.8043 0.6761230    
Residuals               38.865 19                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Drop the interaction:

m14 <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "0p"))
summary(m14)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "0p"))

Residuals:
     Min       1Q   Median       3Q      Max 
-2.89320 -0.68615 -0.06304  0.68486  2.49198 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)            2.9649     0.7018   4.225  0.00015 ***
poly(Generation, 2)1  -4.2553     2.4272  -1.753  0.08786 .  
poly(Generation, 2)2  -5.5072     2.3684  -2.325  0.02565 *  
Rep2                   0.2662     0.8609   0.309  0.75893    
Rep3                  -0.7985     0.8653  -0.923  0.36206    
Rep4                   1.5985     0.8653   1.847  0.07269 .  
Rep5                   0.7309     0.8738   0.836  0.40831    
Rep6                  -0.7323     0.8626  -0.849  0.40133    
Rep7                  -0.2662     0.8609  -0.309  0.75893    
Rep8                   1.7970     0.8797   2.043  0.04824 *  
Rep9                  -0.4000     0.8604  -0.465  0.64473    
Rep10                  0.8647     0.8690   0.995  0.32619    
speed_m1              -0.3309     0.1526  -2.169  0.03660 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.36 on 37 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.5251,    Adjusted R-squared:  0.371 
F-statistic: 3.409 on 12 and 37 DF,  p-value: 0.002013
car::Anova(m14)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value    Pr(>F)    
poly(Generation, 2) 46.959  2 12.6865 6.372e-05 ***
Rep                 32.487  9  1.9504   0.07456 .  
speed_m1             8.705  1  4.7036   0.03660 *  
Residuals           68.478 37                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

So, in conclusion: weak evidence for among-Rep differences in means, but strong evidence for negative autocorrelation. Note that this autocorrelation should act to slow the rate at which the variance in cumulative spread increases with time.
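To gauge the size of that slowing (back-of-envelope, treating the fitted speed_m1 coefficient of about \(-0.33\) loosely as a lag-1 correlation \(\rho\), and ignoring higher-order lags): with only lag-1 dependence,

\[
\operatorname{Var}(S_t) = \sigma^2\left[t + 2(t-1)\rho\right],
\]

so at \(t = 6\) the variance would be \(\sigma^2[6 + 10\rho] \approx 2.7\sigma^2\), less than half the iid value of \(6\sigma^2\).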

Let’s check out the other landscapes:

m24 <- lm(speed ~ Gen + Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "1p"))
m34 <- lm(speed ~ Gen + Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "2p"))
m44 <- lm(speed ~ Gen + Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "3p"))
summary(m24)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "1p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-1.9393 -1.0203 -0.2807  0.7435  3.0813 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept)  1.939e+00  8.046e-01   2.410   0.0213 *
Gen3        -6.966e-01  6.728e-01  -1.035   0.3076  
Gen4        -6.345e-01  6.671e-01  -0.951   0.3481  
Gen5        -2.000e-01  6.664e-01  -0.300   0.7658  
Gen6        -1.131e+00  6.692e-01  -1.690   0.0999 .
Rep2         1.379e-01  9.504e-01   0.145   0.8855  
Rep3        -1.100e-15  9.424e-01   0.000   1.0000  
Rep4        -1.008e-15  9.424e-01   0.000   1.0000  
Rep5         9.379e-01  9.504e-01   0.987   0.3305  
Rep6         4.000e-01  9.424e-01   0.424   0.6738  
Rep7        -3.310e-01  9.444e-01  -0.351   0.7280  
Rep8         4.690e-01  9.444e-01   0.497   0.6226  
Rep9         4.000e-01  9.424e-01   0.424   0.6738  
Rep10        1.007e+00  9.604e-01   1.048   0.3017  
speed_m1    -1.724e-01  1.543e-01  -1.117   0.2715  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.49 on 35 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.1932,    Adjusted R-squared:  -0.1295 
F-statistic: 0.5988 on 14 and 35 DF,  p-value: 0.8474
summary(m34)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "2p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.9139 -0.8927 -0.3292  0.9515  3.3861 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept)  2.206e+00  8.573e-01   2.573   0.0145 *
Gen3        -3.000e-01  7.061e-01  -0.425   0.6736  
Gen4        -5.386e-02  7.076e-01  -0.076   0.9398  
Gen5         3.010e-16  7.061e-01   0.000   1.0000  
Gen6        -6.000e-01  7.061e-01  -0.850   0.4013  
Rep2         1.077e-01  1.003e+00   0.107   0.9151  
Rep3        -1.200e+00  9.986e-01  -1.202   0.2376  
Rep4        -1.200e+00  9.986e-01  -1.202   0.2376  
Rep5        -1.308e+00  1.003e+00  -1.304   0.2007  
Rep6        -1.308e+00  1.003e+00  -1.304   0.2007  
Rep7         7.077e-01  1.003e+00   0.706   0.4850  
Rep8        -1.200e+00  9.986e-01  -1.202   0.2376  
Rep9        -2.015e+00  1.015e+00  -1.985   0.0550 .
Rep10       -4.923e-01  1.003e+00  -0.491   0.6266  
speed_m1    -1.795e-01  1.524e-01  -1.178   0.2469  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.579 on 35 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.2666,    Adjusted R-squared:  -0.02679 
F-statistic: 0.9087 on 14 and 35 DF,  p-value: 0.558
summary(m44)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "3p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-1.8047 -0.8358 -0.5528  0.2732  3.5512 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.150e-03  8.867e-01  -0.004    0.997
Gen3        -2.961e-01  7.572e-01  -0.391    0.698
Gen4         8.520e-01  7.464e-01   1.141    0.261
Gen5        -1.921e-01  7.991e-01  -0.240    0.811
Gen6        -3.480e-01  7.464e-01  -0.466    0.644
Rep2         9.039e-01  1.061e+00   0.852    0.400
Rep3         9.039e-01  1.061e+00   0.852    0.400
Rep4         1.808e+00  1.091e+00   1.657    0.106
Rep5         9.039e-01  1.061e+00   0.852    0.400
Rep6         9.039e-01  1.061e+00   0.852    0.400
Rep7         8.000e-01  1.050e+00   0.762    0.451
Rep8         9.039e-01  1.061e+00   0.852    0.400
Rep9         9.039e-01  1.061e+00   0.852    0.400
Rep10       -7.436e-17  1.050e+00   0.000    1.000
speed_m1    -1.299e-01  1.842e-01  -0.705    0.485

Residual standard error: 1.661 on 35 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.1824,    Adjusted R-squared:  -0.1447 
F-statistic: 0.5575 on 14 and 35 DF,  p-value: 0.879
car::Anova(m24)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value Pr(>F)
Gen        7.794  4  0.8776 0.4873
Rep        7.984  9  0.3996 0.9268
speed_m1   2.772  1  1.2485 0.2715
Residuals 77.708 35               
car::Anova(m34)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value Pr(>F)
Gen        2.708  4  0.2715 0.8943
Rep       28.733  9  1.2805 0.2819
speed_m1   3.458  1  1.3869 0.2469
Residuals 87.262 35               
car::Anova(m44)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value Pr(>F)
Gen        9.585  4  0.8687 0.4924
Rep       10.624  9  0.4279 0.9109
speed_m1   1.372  1  0.4974 0.4853
Residuals 96.548 35               

Nothing to see here, folks!

The negative autocorrelation is, I think, one of the patterns that we want the model to reproduce.

(PART) 10 May 2017

Spread patterns in evolution experiments

We’ll repeat the analysis with the newly loaded RIL data. Let’s calculate the means and variances of cumulative spread:

cum_spread_stats <- group_by(RIL_spread, Treatment, Gap, Gen, Generation) %>%
  summarise(mean_spread = mean(Furthest),
            var_spread = var(Furthest),
            CV_spread = sqrt(var_spread)/mean_spread
  )

And now plot the results:

ggplot(aes(x = Generation, y = mean_spread, color = Treatment), data = cum_spread_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = var_spread, color = Treatment), data = cum_spread_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = CV_spread, color = Treatment), data = cum_spread_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)
Warning: Removed 1 rows containing missing values (geom_point).

The patterns in the mean are probably not distinguishable from linear, although I’d want to see CIs.
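As a first pass at those CIs, here’s a sketch assuming `RIL_spread` as loaded above, using a normal approximation (mean ± 2 SE across replicates):

```r
# Sketch: rough normal-approximation CIs for mean cumulative spread
# (assumes RIL_spread as above; ribbon is mean +/- 2 SE across replicates)
ci_stats <- group_by(RIL_spread, Treatment, Gap, Generation) %>%
  summarise(mean_spread = mean(Furthest),
            se_spread = sd(Furthest) / sqrt(n()))
ggplot(aes(x = Generation, y = mean_spread, color = Treatment, fill = Treatment),
       data = ci_stats) +
  geom_ribbon(aes(ymin = mean_spread - 2 * se_spread,
                  ymax = mean_spread + 2 * se_spread),
              alpha = 0.2, color = NA) +
  geom_point() + geom_line() + facet_wrap(~ Gap)
```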

Let’s look at per-generation spread.

speed_stats <- group_by(RIL_spread, Treatment, Gap, Gen, Generation) %>%
  summarise(mean_spread = mean(speed),
            var_spread = var(speed),
            CV_spread = sqrt(var_spread)/mean_spread
  )

And now plot the results:

ggplot(aes(x = Generation, y = mean_spread, color = Treatment), data = speed_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = var_spread, color = Treatment), data = speed_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = CV_spread, color = Treatment), data = speed_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)
Warning: Removed 6 rows containing missing values (geom_point).
Warning: Removed 1 rows containing missing values (geom_path).

In 0p and 1p, the treatments look very similar. But I bet there will be differences in the autocorrelation structure. Let’s fit some models.

m1_0NO <- lm(speed ~ Gen + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "0p", Treatment == "NO"))
summary(m1_0NO)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "NO"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.3398 -0.9501  0.0388  0.7904  3.5296 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)   2.3567     0.7471   3.154   0.0029 **
Gen3         -0.2490     0.6206  -0.401   0.6902   
Gen4         -0.4796     0.6155  -0.779   0.4401   
Gen5          0.1000     0.6146   0.163   0.8715   
Gen6          1.0612     0.6232   1.703   0.0957 . 
Gen7         -0.2470     0.6669  -0.370   0.7129   
Rep2         -0.2007     0.7955  -0.252   0.8020   
Rep3          0.2347     0.8017   0.293   0.7711   
Rep4          0.6667     0.7934   0.840   0.4053   
Rep5          1.4013     0.8017   1.748   0.0874 . 
Rep6         -0.4660     0.7955  -0.586   0.5610   
Rep7          0.2007     0.7955   0.252   0.8020   
Rep8          0.7177     0.7981   0.899   0.3734   
Rep9          0.9353     0.8119   1.152   0.2556   
Rep10         1.0850     0.8063   1.346   0.1853   
speed_m1     -0.1020     0.1726  -0.591   0.5575   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.374 on 44 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.2873,    Adjusted R-squared:  0.04432 
F-statistic: 1.182 on 15 and 44 DF,  p-value: 0.3201
car::Anova(m1_0NO)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value Pr(>F)
Gen       14.832  5  1.5709 0.1882
Rep       18.390  9  1.0820 0.3949
speed_m1   0.660  1  0.3493 0.5575
Residuals 83.090 44               
m1_0YES <- lm(speed ~ Gen + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "0p", Treatment == "YES"))
summary(m1_0YES)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "YES"))

Residuals:
    Min      1Q  Median      3Q     Max 
-4.4242 -0.9333 -0.0096  0.7844  9.4123 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)   3.1877     1.2650   2.520   0.0154 * 
Gen3          0.8602     1.0866   0.792   0.4328   
Gen4          0.3393     1.0909   0.311   0.7572   
Gen5          3.4000     1.0862   3.130   0.0031 **
Gen6          0.4369     1.1807   0.370   0.7131   
Gen7         -0.5796     1.0877  -0.533   0.5968   
Rep2         -1.2330     1.4031  -0.879   0.3843   
Rep3         -0.7330     1.4031  -0.522   0.6040   
Rep4         -0.9329     1.4041  -0.664   0.5099   
Rep5         -0.6667     1.4023  -0.475   0.6368   
Rep6         -0.1667     1.4023  -0.119   0.9059   
Rep7         -0.5663     1.4031  -0.404   0.6884   
Rep8         -1.1998     1.4025  -0.856   0.3969   
Rep9         -0.4668     1.4025  -0.333   0.7408   
Rep10        -0.3333     1.4023  -0.238   0.8132   
speed_m1     -0.1990     0.1446  -1.376   0.1757   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.429 on 44 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:   0.32, Adjusted R-squared:  0.08825 
F-statistic: 1.381 on 15 and 44 DF,  p-value: 0.1992
car::Anova(m1_0YES)
Anova Table (Type II tests)

Response: speed
           Sum Sq Df F value  Pr(>F)  
Gen        96.161  5  3.2602 0.01369 *
Rep         9.104  9  0.1715 0.99602  
speed_m1   11.173  1  1.8940 0.17572  
Residuals 259.560 44                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Nothing here, really, except for a generation effect in the evolution treatment, which is in turn driven by the high mean in generation 5. We don’t have enough df to look at an interaction, so let’s look at the distribution of values to see whether it’s a single outlier:

filter(RIL_spread, Treatment == "YES", Gap == "0p", Generation == 5)$speed
 [1] 16  3  7  6  5  7  1  4  3  3

It’s rep 1 with a crazy jump. Could that be a data error? Let’s look at the time series:

filter(RIL_spread, Treatment == "YES", Gap == "0p", Rep == "1")$speed
[1]  2 -1  0  0 16  1  4
popRIL %>% filter(Treatment == "YES", Gap == "0p", Rep == "1") %>%
  select(Generation, Pot, Seedlings) %>%
  reshape(timevar = "Generation", direction = "wide", idvar = "Pot")
   Pot Seedlings.1 Seedlings.2 Seedlings.3 Seedlings.4 Seedlings.5
1    0         262         503         602         785         506
2    1          47         330         484         454         172
3    2           5          NA          NA          NA         298
13   3          NA          NA          NA          NA         236
14   4          NA          NA          NA          NA         176
15   5          NA          NA          NA          NA         519
16   6          NA          NA          NA          NA         235
17   7          NA          NA          NA          NA         340
18   8          NA          NA          NA          NA         322
19   9          NA          NA          NA          NA         346
20  10          NA          NA          NA          NA         346
21  11          NA          NA          NA          NA         340
22  12          NA          NA          NA          NA         345
23  13          NA          NA          NA          NA         208
24  14          NA          NA          NA          NA         117
25  15          NA          NA          NA          NA           1
26  17          NA          NA          NA          NA           1
43  16          NA          NA          NA          NA          NA
45  18          NA          NA          NA          NA          NA
65  19          NA          NA          NA          NA          NA
66  20          NA          NA          NA          NA          NA
67  22          NA          NA          NA          NA          NA
   Seedlings.6 Seedlings.7
1          880         271
2          601         166
3          845         249
13         992         332
14         628         225
15         788         388
16         526         571
17         778         285
18         630         563
19         490         384
20         733         268
21         632         312
22         602         368
23         281         133
24         826         194
25         152         266
26          96         355
43         644         136
45           4         251
65          NA          89
66          NA          15
67          NA           1

There is clearly missing data in generations 2-4! This also appears to have been fixed in the Science paper, as there is no replicate sitting at pot 2 for several generations.
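It would be worth scanning the rest of the data for the same signature: pots observed, then missing for some generations, then observed again. A quick check, assuming `popRIL` as above with one row per pot per observed generation and numeric `Generation`:

```r
# Sketch: flag pots whose observations skip generations, which would suggest
# missing (rather than truly absent) data. Assumes popRIL as loaded above.
popRIL %>%
  filter(!is.na(Seedlings)) %>%
  group_by(Treatment, Gap, Rep, Pot) %>%
  summarise(n_obs = n(),
            span = max(Generation) - min(Generation) + 1) %>%
  filter(span > n_obs)
```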

I’ve emailed Jenn, but for now let’s look at the data without Rep 1.

Let’s recalculate the means and variances of cumulative spread:

cum_spread_stats <- filter(RIL_spread, !(Rep == "1" & Treatment == "YES" & Gap == "0p")) %>%
  group_by(Treatment, Gap, Gen, Generation) %>%
  summarise(mean_spread = mean(Furthest),
            var_spread = var(Furthest),
            CV_spread = sqrt(var_spread)/mean_spread
  )

And now plot the results:

ggplot(aes(x = Generation, y = mean_spread, color = Treatment), data = cum_spread_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = var_spread, color = Treatment), data = cum_spread_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = CV_spread, color = Treatment), data = cum_spread_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)
Warning: Removed 1 rows containing missing values (geom_point).

The patterns in the mean are probably not distinguishable from linear, although I’d want to see CIs.

Let’s look at per-generation spread.

speed_stats <- filter(RIL_spread, !(Rep == "1" & Treatment == "YES" & Gap == "0p")) %>%
  group_by(Treatment, Gap, Gen, Generation) %>%
  summarise(mean_spread = mean(speed),
            var_spread = var(speed),
            CV_spread = sqrt(var_spread)/mean_spread
  )

And now plot the results:

ggplot(aes(x = Generation, y = mean_spread, color = Treatment), data = speed_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = var_spread, color = Treatment), data = speed_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = CV_spread, color = Treatment), data = speed_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)
Warning: Removed 6 rows containing missing values (geom_point).
Warning: Removed 1 rows containing missing values (geom_path).

With Rep 1 removed, the treatments still look very similar in 0p and 1p. But I bet there will be differences in the autocorrelation structure. Let’s refit the models.

m1_0NO <- lm(speed ~ Gen + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "0p", Treatment == "NO"))
summary(m1_0NO)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "NO"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.3398 -0.9501  0.0388  0.7904  3.5296 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)   2.3567     0.7471   3.154   0.0029 **
Gen3         -0.2490     0.6206  -0.401   0.6902   
Gen4         -0.4796     0.6155  -0.779   0.4401   
Gen5          0.1000     0.6146   0.163   0.8715   
Gen6          1.0612     0.6232   1.703   0.0957 . 
Gen7         -0.2470     0.6669  -0.370   0.7129   
Rep2         -0.2007     0.7955  -0.252   0.8020   
Rep3          0.2347     0.8017   0.293   0.7711   
Rep4          0.6667     0.7934   0.840   0.4053   
Rep5          1.4013     0.8017   1.748   0.0874 . 
Rep6         -0.4660     0.7955  -0.586   0.5610   
Rep7          0.2007     0.7955   0.252   0.8020   
Rep8          0.7177     0.7981   0.899   0.3734   
Rep9          0.9353     0.8119   1.152   0.2556   
Rep10         1.0850     0.8063   1.346   0.1853   
speed_m1     -0.1020     0.1726  -0.591   0.5575   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.374 on 44 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.2873,    Adjusted R-squared:  0.04432 
F-statistic: 1.182 on 15 and 44 DF,  p-value: 0.3201
car::Anova(m1_0NO)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value Pr(>F)
Gen       14.832  5  1.5709 0.1882
Rep       18.390  9  1.0820 0.3949
speed_m1   0.660  1  0.3493 0.5575
Residuals 83.090 44               
m1_0YES <- lm(speed ~ Gen + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "0p", Treatment == "YES", Rep != "1"))
summary(m1_0YES)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "YES", Rep != "1"))

Residuals:
    Min      1Q  Median      3Q     Max 
-3.3389 -0.9048 -0.2336  0.9944  3.2949 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)  2.51572    0.90182   2.790  0.00812 **
Gen3         0.91819    0.78315   1.172  0.24814   
Gen4         0.37478    0.79621   0.471  0.64048   
Gen5         1.94748    0.78364   2.485  0.01735 * 
Gen6         0.08289    0.83462   0.099  0.92140   
Gen7        -1.19900    0.78447  -1.528  0.13448   
Rep3         0.50000    0.95896   0.521  0.60504   
Rep4         0.28939    0.95926   0.302  0.76450   
Rep5         0.58789    0.96017   0.612  0.54391   
Rep6         1.08789    0.96017   1.133  0.26412   
Rep7         0.66667    0.95896   0.695  0.49105   
Rep8         0.04394    0.95926   0.046  0.96369   
Rep9         0.79850    0.96168   0.830  0.41141   
Rep10        0.92122    0.96017   0.959  0.34325   
speed_m1    -0.26367    0.14451  -1.825  0.07573 . 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.661 on 39 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.3745,    Adjusted R-squared:  0.1499 
F-statistic: 1.668 on 14 and 39 DF,  p-value: 0.1038
car::Anova(m1_0YES)
Anova Table (Type II tests)

Response: speed
           Sum Sq Df F value   Pr(>F)   
Gen        49.133  5  3.5619 0.009509 **
Rep         6.751  8  0.3059 0.959291   
speed_m1    9.184  1  3.3291 0.075731 . 
Residuals 107.593 39                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There’s still an effect of Gen 5, but now it looks like just the peak of a quadratic pattern:

m2_0NO <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "0p", Treatment == "NO"))
summary(m2_0NO)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "NO"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.8175 -0.9077 -0.0137  0.7040  3.0469 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)   
(Intercept)            2.4458     0.7068   3.460  0.00116 **
poly(Generation, 2)1   1.8290     2.2192   0.824  0.41402   
poly(Generation, 2)2  -0.1183     2.2043  -0.054  0.95744   
Rep2                  -0.2185     0.8279  -0.264  0.79295   
Rep3                   0.2704     0.8340   0.324  0.74719   
Rep4                   0.6667     0.8258   0.807  0.42359   
Rep5                   1.4371     0.8340   1.723  0.09143 . 
Rep6                  -0.4481     0.8279  -0.541  0.59087   
Rep7                   0.2185     0.8279   0.264  0.79295   
Rep8                   0.7445     0.8304   0.896  0.37456   
Rep9                   0.9890     0.8440   1.172  0.24723   
Rep10                  1.1297     0.8385   1.347  0.18437   
speed_m1              -0.1556     0.1743  -0.893  0.37649   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.43 on 47 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.1751,    Adjusted R-squared:  -0.03547 
F-statistic: 0.8316 on 12 and 47 DF,  p-value: 0.6182
car::Anova(m2_0NO)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value Pr(>F)
poly(Generation, 2)  1.757  2  0.4294 0.6534
Rep                 19.278  9  1.0469 0.4185
speed_m1             1.631  1  0.7972 0.3765
Residuals           96.165 47               
m2_0YES <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "0p", Treatment == "YES", Rep != "1"))
summary(m2_0YES)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "YES", Rep != "1"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.7428 -0.9827 -0.2568  0.8754  3.4194 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)   
(Intercept)           2.71330    0.78203   3.470  0.00122 **
poly(Generation, 2)1  1.68876    2.66425   0.634  0.52961   
poly(Generation, 2)2 -8.22899    2.61909  -3.142  0.00307 **
Rep3                  0.50000    0.98302   0.509  0.61367   
Rep4                  0.28026    0.98328   0.285  0.77703   
Rep5                  0.60615    0.98407   0.616  0.54124   
Rep6                  1.10615    0.98407   1.124  0.26737   
Rep7                  0.66667    0.98302   0.678  0.50137   
Rep8                  0.05308    0.98328   0.054  0.95721   
Rep9                  0.82590    0.98538   0.838  0.40669   
Rep10                 0.93949    0.98407   0.955  0.34519   
speed_m1             -0.31846    0.13641  -2.335  0.02442 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.703 on 42 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.2921,    Adjusted R-squared:  0.1067 
F-statistic: 1.576 on 11 and 42 DF,  p-value: 0.1417
car::Anova(m2_0YES)
Anova Table (Type II tests)

Response: speed
                     Sum Sq Df F value   Pr(>F)   
poly(Generation, 2)  34.970  2  6.0315 0.004981 **
Rep                   7.029  8  0.3031 0.960710   
speed_m1             15.799  1  5.4499 0.024425 * 
Residuals           121.756 42                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In the evolution treatment, strong quadratic effect of generation and negative autocorrelation. Nothing in the no evolution treatment!
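Rather than comparing the two fits by eye, the treatments could be contrasted in a single model. A sketch, assuming `RIL_spread` as above and dropping the Rep term (which showed nothing in any of the fits above); Rep 1 is excluded from the evolution treatment as before:

```r
# Sketch: test directly whether the quadratic trend and lag-1 coefficient
# differ between treatments in the 0p landscape (Rep 1 dropped from YES)
d0 <- filter(RIL_spread, Gap == "0p", !(Treatment == "YES" & Rep == "1"))
m_trt <- lm(speed ~ Treatment * (poly(Generation, 2) + speed_m1), data = d0)
car::Anova(m_trt)
```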

Let’s look at the rest of the landscapes.

m2_0NO <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "1p", Treatment == "NO"))
summary(m2_0NO)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "1p", Treatment == "NO"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.6149 -0.7524  0.1080  0.6191  3.1936 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)   
(Intercept)            1.3527     0.5187   2.608  0.01218 * 
poly(Generation, 2)1   5.7573     1.8493   3.113  0.00315 **
poly(Generation, 2)2  -4.4217     1.8359  -2.408  0.02000 * 
Rep2                   0.2124     0.6976   0.304  0.76215   
Rep3                   0.9852     0.7056   1.396  0.16919   
Rep4                  -0.2271     0.6928  -0.328  0.74447   
Rep5                  -0.2271     0.6928  -0.328  0.74447   
Rep6                  -0.3333     0.6912  -0.482  0.63186   
Rep7                   0.6519     0.7056   0.924  0.36027   
Rep8                   0.6519     0.7056   0.924  0.36027   
Rep9                   0.5457     0.6976   0.782  0.43801   
Rep10                 -1.2124     0.6976  -1.738  0.08879 . 
speed_m1              -0.3186     0.1419  -2.246  0.02947 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.197 on 47 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.3434,    Adjusted R-squared:  0.1758 
F-statistic: 2.049 on 12 and 47 DF,  p-value: 0.04033
car::Anova(m2_0NO)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value   Pr(>F)   
poly(Generation, 2) 15.094  2  5.2656 0.008641 **
Rep                 18.723  9  1.4514 0.194196   
speed_m1             7.228  1  5.0431 0.029466 * 
Residuals           67.363 47                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m2_0YES <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "1p", Treatment == "YES", Rep != "1"))
summary(m2_0YES)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "1p", Treatment == "YES", Rep != "1"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.4322 -0.8911 -0.2425  0.8770  3.2000 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)           2.391e+00  6.566e-01   3.642 0.000737 ***
poly(Generation, 2)1 -2.207e+00  2.251e+00  -0.980 0.332469    
poly(Generation, 2)2 -4.364e-01  2.199e+00  -0.198 0.843638    
Rep3                 -8.006e-01  8.521e-01  -0.940 0.352776    
Rep4                 -1.134e+00  8.521e-01  -1.331 0.190420    
Rep5                 -1.467e+00  8.521e-01  -1.722 0.092418 .  
Rep6                 -1.694e-15  8.464e-01   0.000 1.000000    
Rep7                 -1.994e-01  8.521e-01  -0.234 0.816132    
Rep8                  5.343e-01  8.591e-01   0.622 0.537352    
Rep9                 -7.336e-01  8.478e-01  -0.865 0.391776    
Rep10                -1.067e+00  8.478e-01  -1.258 0.215167    
speed_m1             -2.009e-01  1.469e-01  -1.368 0.178626    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.466 on 42 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.2046,    Adjusted R-squared:  -0.003779 
F-statistic: 0.9819 on 11 and 42 DF,  p-value: 0.4775
car::Anova(m2_0YES)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value Pr(>F)
poly(Generation, 2)  3.847  2  0.8950 0.4163
Rep                 18.402  8  1.0703 0.4019
speed_m1             4.021  1  1.8711 0.1786
Residuals           90.269 42               
m2_0NO <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "2p", Treatment == "NO"))
summary(m2_0NO)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "2p", Treatment == "NO"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.5500 -1.0534 -0.1734  0.5902  4.3641 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)  
(Intercept)           9.251e-01  6.895e-01   1.342   0.1861  
poly(Generation, 2)1 -4.270e+00  2.527e+00  -1.690   0.0977 .
poly(Generation, 2)2  6.366e+00  2.470e+00   2.577   0.0132 *
Rep2                 -6.145e-01  9.541e-01  -0.644   0.5226  
Rep3                  1.844e+00  9.791e-01   1.883   0.0659 .
Rep4                  1.229e+00  9.635e-01   1.276   0.2084  
Rep5                 -1.145e-01  9.541e-01  -0.120   0.9050  
Rep6                  7.448e-16  9.509e-01   0.000   1.0000  
Rep7                  6.145e-01  9.541e-01   0.644   0.5226  
Rep8                  6.145e-01  9.541e-01   0.644   0.5226  
Rep9                  1.115e+00  9.541e-01   1.168   0.2486  
Rep10                 3.855e-01  9.541e-01   0.404   0.6880  
speed_m1             -2.290e-01  1.555e-01  -1.473   0.1474  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.647 on 47 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.2538,    Adjusted R-squared:  0.06328 
F-statistic: 1.332 on 12 and 47 DF,  p-value: 0.2331
car::Anova(m2_0NO)
Anova Table (Type II tests)

Response: speed
                     Sum Sq Df F value  Pr(>F)  
poly(Generation, 2)  18.231  2  3.3605 0.04324 *
Rep                  24.883  9  1.0192 0.43883  
speed_m1              5.885  1  2.1694 0.14745  
Residuals           127.489 47                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m2_0YES <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "2p", Treatment == "YES", Rep != "1"))
summary(m2_0YES)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "2p", Treatment == "YES", Rep != "1"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.5667 -1.4896 -0.9113  1.4641  4.6388 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)  
(Intercept)            2.1786     0.9285   2.346   0.0238 *
poly(Generation, 2)1  -1.1631     3.4858  -0.334   0.7403  
poly(Generation, 2)2  -3.2860     3.6111  -0.910   0.3680  
Rep3                  -0.2584     1.2902  -0.200   0.8422  
Rep4                  -1.1208     1.2815  -0.875   0.3868  
Rep5                  -0.3792     1.2815  -0.296   0.7687  
Rep6                  -0.3792     1.2815  -0.296   0.7687  
Rep7                  -1.1208     1.2815  -0.875   0.3868  
Rep8                   0.2416     1.2902   0.187   0.8524  
Rep9                  -1.0000     1.2786  -0.782   0.4385  
Rep10                 -0.3792     1.2815  -0.296   0.7687  
speed_m1              -0.2416     0.1731  -1.396   0.1701  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.215 on 42 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.1019,    Adjusted R-squared:  -0.1334 
F-statistic: 0.4331 on 11 and 42 DF,  p-value: 0.9321
car::Anova(m2_0YES)
Anova Table (Type II tests)

Response: speed
                     Sum Sq Df F value Pr(>F)
poly(Generation, 2)   9.878  2  1.0071 0.3739
Rep                  10.909  8  0.2781 0.9696
speed_m1              9.553  1  1.9479 0.1701
Residuals           205.971 42               
m2_0NO <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "3p", Treatment == "NO"))
summary(m2_0NO)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "3p", Treatment == "NO"))

Residuals:
    Min      1Q  Median      3Q     Max 
-4.4865 -0.7241 -0.0991  0.2443  3.1763 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)  
(Intercept)           1.595e+00  6.701e-01   2.381   0.0214 *
poly(Generation, 2)1  2.321e-01  2.426e+00   0.096   0.9242  
poly(Generation, 2)2  2.721e+00  2.405e+00   1.131   0.2636  
Rep2                 -6.928e-16  9.135e-01   0.000   1.0000  
Rep3                 -1.524e+00  9.207e-01  -1.655   0.1045  
Rep4                 -1.524e+00  9.207e-01  -1.655   0.1045  
Rep5                 -8.573e-01  9.207e-01  -0.931   0.3565  
Rep6                 -4.760e-01  9.207e-01  -0.517   0.6076  
Rep7                 -1.524e+00  9.207e-01  -1.655   0.1045  
Rep8                 -6.667e-01  9.135e-01  -0.730   0.4692  
Rep9                 -6.667e-01  9.135e-01  -0.730   0.4692  
Rep10                -1.333e+00  9.135e-01  -1.460   0.1511  
speed_m1             -2.860e-01  1.715e-01  -1.667   0.1021  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.582 on 47 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.1767,    Adjusted R-squared:  -0.03345 
F-statistic: 0.8409 on 12 and 47 DF,  p-value: 0.6093
car::Anova(m2_0NO)
Anova Table (Type II tests)

Response: speed
                     Sum Sq Df F value Pr(>F)
poly(Generation, 2)   5.157  2  1.0299 0.3649
Rep                  18.543  9  0.8229 0.5982
speed_m1              6.961  1  2.7805 0.1021
Residuals           117.671 47               
m2_0YES <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "3p", Treatment == "YES", Rep != "1"))
summary(m2_0YES)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "3p", Treatment == "YES", Rep != "1"))

Residuals:
    Min      1Q  Median      3Q     Max 
-3.3934 -1.1567 -0.0894  0.8961  6.0397 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)           1.543e+00  8.663e-01   1.781 0.082173 .  
poly(Generation, 2)1  4.948e+00  3.323e+00   1.489 0.143922    
poly(Generation, 2)2 -1.057e+01  3.209e+00  -3.293 0.002019 ** 
Rep3                  2.596e-15  1.189e+00   0.000 1.000000    
Rep4                  1.009e+00  1.193e+00   0.846 0.402187    
Rep5                  1.009e+00  1.193e+00   0.846 0.402187    
Rep6                 -1.009e+00  1.193e+00  -0.846 0.402187    
Rep7                  1.009e+00  1.193e+00   0.846 0.402187    
Rep8                  1.832e-15  1.189e+00   0.000 1.000000    
Rep9                 -1.009e+00  1.193e+00  -0.846 0.402187    
Rep10                -1.009e+00  1.193e+00  -0.846 0.402187    
speed_m1             -5.141e-01  1.322e-01  -3.888 0.000353 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.06 on 42 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.381, Adjusted R-squared:  0.2189 
F-statistic:  2.35 on 11 and 42 DF,  p-value: 0.02312
car::Anova(m2_0YES)
Anova Table (Type II tests)

Response: speed
                     Sum Sq Df F value    Pr(>F)    
poly(Generation, 2)  47.607  2  5.6079 0.0069405 ** 
Rep                  34.413  8  1.0134 0.4408425    
speed_m1             64.175  1 15.1191 0.0003532 ***
Residuals           178.275 42                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

It’s a very mixed bag here. In summary, polynomial time and negative autocorrelation are found in:

  • 0p, evo YES
  • 1p, evo NO
  • 3p, evo YES

I’m not sure how to interpret this; maybe these are just random artifacts?

17 May 2017

Updates

Data issues

I’ve had some email exchanges with Jenn about data; basically, I was mistaken in using the Excel files from the data directory. In particular, she said this about the RIL data (on May 10):

The data that we used to make figure 2 came from this file (in the analysis folder):

seedlingsALL<-read.csv('2016_03_04_seedlingsALL_corrected.csv', header=T)

This data file does not include the Ler data. She said, in addition:

For the Exp 1 data, use the script 00_ArabiDD_ImportTransformData.R (in my github, I hope!) to read in the corrected files.

It is not in the Dropbox directory; rather it is at https://github.com/jennwilliamsubc/ArabiDD_analyses/blob/master/00_ArabiDD_ImportTransformData.R.

The file is displayed below; it does a lot more than import seedling data! It seems to pull in all the other ancillary data for Ler, so it will be a good guide for that.

One thing I will need to ask about is the ancillary data for RILs.

Presentation at Bren

The Arabidopsis work was to have been the centerpiece of my presentation to the faculty & PhD students at Bren yesterday, but as it turned out I didn’t even get to that section. In the meeting with the students a few questions came up, but they were generally clarifying questions; nothing about the actual results!

Jenn’s Ler script

Here it is, as found on github with a date of Aug 24, 2016:

##June 21, 2016

#Use this source file for all of the scripts for the Arabidopsis Density Project
#It doesn't need to be in MarkDown.

#Idea is to pull all data to here, and then to source these Data when you run the other files.

setwd("~/Dropbox/Arabidopsis/analysis")

#libraries used in the projects
#can you rewrite these as require?
library(plyr)
library(lattice)
library(Hmisc)

#import files that are siliques
gen0sil<-read.csv('2013_11_11_Exp1_Gen0_siliques.csv', header=T)
names(gen0sil)<-c("gen", "ID", "trt","gap_size","pot","siliques")
gen1sil<-read.csv("2014_04_15_Exp1_Gen1_siliques.csv", header=T)
names(gen1sil)<-c("gen", "ID", "trt","gap_size","pot","siliques")
gen2sil<-read.csv("2014_06_12_Exp1_Gen2_siliques.csv", header=T)
#note, I deleted the column called 'rep' from the csv. I'm not sure what it means, and the other data files don't have it
names(gen2sil)<-c("gen", "ID", "trt","gap_size","pot","siliques")
gen3sil<-read.csv("2014_11_27_Exp1_Gen3_siliques.csv", header=T)
names(gen3sil)<-c("gen", "ID", "trt","gap_size","pot","siliques")
gen4sil<-read.csv("2015_03_13_Exp1_Gen4_siliques.csv", header=T)
names(gen4sil)<-c("gen", "ID", "trt","gap_size","pot","siliques")
gen5sil<-read.csv("2015_08_25_Exp1_Gen5_siliques.csv", header=T)
names(gen5sil)<-c("gen", "ID", "trt","gap_size","pot","siliques")

allsil<-rbind(gen0sil, gen1sil, gen2sil, gen3sil, gen4sil, gen5sil)

#since we used the counts from trtA plants for trtB, we can only analyze variation from trtA plants
allsilA<-subset(allsil, trt=="A")

#we could also merge these data (very carefully!) with the runway data from the next generation
#that would allow us to say something about how silique number affects dispersal/spread in the two treatments

#HOWEVER, this will only include a subset of the data, because the silique numbers we have are not necessarily for the leading edge of the solitary plants (should be of the clipped, though!)

#I collected height data in October 2015
#I did not enter the data, and there are some funny things that I should have noticed earlier!
#for example, every record should have a chamber number, but those are missing
#do the raw data sheets exist anywhere?
gen5heights<-read.csv("2015_10_26_Exp1_Gen5_Height.csv", header=T)

########-----------
###import the side experiment data here (and anything else you think might be helpful)
disp_spray<-read.csv("2013_08_08_Exp1_Spray.csv", header=T)
#names(disp_spray)<-c("ecot","trt","mom_pot","pot","d","sdlgs","mom_sil")

#import Frankenstein data here (all data that are relevant to effects of density on height & siliques)
frank_traits<-read.csv("2016_06_24_LerTraitsxDensity_ALLexperiments.csv", header=T)
frank_traits_ring<-subset(frank_traits, exp=="ring")

testply<-ddply(frank_traits, .(ID), summarise, 
                meansil=mean(siliques, na.rm=T), 
                meanht=mean(height, na.rm=T)) 
meantraits<-join(testply, frank_traits, by="ID",type="left",match="first")
meantraits$totsil<-meantraits$meansil*meantraits$final_density

########-----------
###ALL SEEDLING DATA

seedlings_gen1<-read.csv("2014_02_18_Exp1_Gen1_seedposition.csv", header=T)
names(seedlings_gen1)<-c("ecot","ID","trt","Rep","gap_size","gen","pot","d","sdlgs")

seedlings_gen2<-read.csv("2014_06_12_Exp1_Gen2_seedposition.csv", header=T)
names(seedlings_gen2)<-c("ecot","ID","trt","Rep","gap_size","gen","pot","d","sdlgs")

seedlings_gen3<-read.csv("2014_11_11_Exp1_Gen3_seedposition.csv", header=T) #Gen3
names(seedlings_gen3)<-c("ecot","ID","trt","Rep","gap_size","gen","pot","d","sdlgs")

seedlings_gen4<-read.csv("2015_03_11_Exp1_Gen4_seedposition.csv", header=T) #Gen4
names(seedlings_gen4)<-c("ecot","ID","trt","Rep","gap_size","gen","pot","d","sdlgs")

seedlings_gen5<-read.csv("2015_08_11_Exp1_Gen5_seedposition.csv", header=T) #Gen5
names(seedlings_gen5)<-c("ecot","ID","trt","Rep","gap_size","gen","pot","d","sdlgs")

seedlings_gen6<-read.csv("2015_11_10_Exp1_Gen6_seedposition.csv", header=T) #Gen6
names(seedlings_gen6)<-c("ecot","ID","trt","Rep","gap_size","gen","pot","d","sdlgs")

seedlings_gen6<- seedlings_gen6[-140,] #drop ID 27 record of jumping 8 pots
seedlings_gen6<- seedlings_gen6[-c(163:169),] #drop ID 32 record of jumping 6 pots and being super dense

seedlingsALL<-rbind(seedlings_gen1,seedlings_gen2, seedlings_gen3, seedlings_gen4, seedlings_gen5, seedlings_gen6)

#drop runways with problems
# 35 & 60: a pot went missing (so it looks like the runway shrank)
# 45: different data on spreadsheet as on data sheets and can't figure it out. also, HUGE leap
seedlingsALL<-subset(seedlingsALL,!(ID %in% c(35, 60)))

############CALCULATE MAXD FOR EACH GENERATION AND MAKE FILE OF SPEEDS

#calculate max distance and number of plants/pot in all generations
maxd_ply<-ddply(seedlingsALL, .(ID,gen), summarise, 
                lastpot=max(pot, na.rm=T), 
                farseed=max(d,na.rm=T),
                lastsdlgs=sum(sdlgs[pot==max(pot)])) #this selects number of sdlgs in the last pot
maxd<-join(maxd_ply, seedlingsALL, by=c("ID","gen"),type="left",match="first")
maxd<-maxd[,1:9]

#now, move dense pots to the furthest right they could be
maxd$farseed<-ifelse(maxd$lastpot==0,7,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==2 & maxd$lastsdlgs>=10,22,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==3 & maxd$lastsdlgs>=10,29,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==4 & maxd$lastsdlgs>=10,36,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==5 & maxd$lastsdlgs>=10,44,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==6 & maxd$lastsdlgs>=10,51,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==7 & maxd$lastsdlgs>=10,58,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==8 & maxd$lastsdlgs>=10,66,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==9 & maxd$lastsdlgs>=10,73,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==10 & maxd$lastsdlgs>=10,80,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==11 & maxd$lastsdlgs>=10,88,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==12 & maxd$lastsdlgs>=10,95,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==13 & maxd$lastsdlgs>=10,102,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==16 & maxd$lastsdlgs>=10,124,maxd$farseed)
maxd$farseed <-ifelse(maxd$lastpot==17 & maxd$lastsdlgs>=10,132,maxd$farseed)

#rescale so pot 0 is at 0
maxd$farseed<-maxd$farseed - 7

#create some variables so we can look at speed from different time points.
#Goal: I want to be able to say something about distance traveled between gen. 1 and gen. 6 and between gen. 1 and gen. 5

speeds<-ddply(maxd, .(ID), summarise, 
              dist1to6 = farseed[6] - farseed[1],
              dist1to5 = farseed[5] - farseed[1],
              farseed6 = farseed[6],
              farseed5 = farseed[5],
              dist1to3 = farseed[3] - farseed[1])

speeds<-join(speeds, maxd, by=c("ID"),type="left",match="first")
speeds<-speeds[,c(1:6, 12:14)]

##ADD gap size as a continuous variable to the dataframe with Bruce's function.

# Function to add gap size as a continuous variable in units of mean dispersal distance
#   (1 pot is 2x mean dispersal) to a data frame. 

#The estimate of 1 pot = 2 x MDD is stil an estimate, but better than what we had before. 
#updata again, when you calculate it more precisely out the mean dispersal distance of Ler
# Argument must be a data frame with a column named "gap_size"
# Relies on the fact that the levels of the variable (1, 2, 3, 4) correspond to gap sizes
#   (0, 1, 2, 3) pots
add_gapnumber <- function(x) {
    x$gapnumber <- (as.numeric(x$gap_size) - 1) * 2
    return(x)
}

maxd<-add_gapnumber(maxd)
speeds<-add_gapnumber(speeds)

####*********************************************************************************
########HEIGHT DATA FROM GEN. 5
gen5heights<-read.csv("2015_10_26_Exp1_Gen5_Height.csv", header=T)

#a few more transformations and joining
maxd_gen5<-subset(maxd, gen==5)
maxd_gen5<-join(maxd_gen5, gen5heights, by=c("ID"),type="left",match="first")

maxd_gen6<-subset(maxd, gen==6)
#this might be dangerous (but I don't think I will be sorting the data)
maxd_gen5$farseedT1<-maxd_gen6$farseed
maxd_gen5$laspotT1<-maxd_gen6$lastpot
maxd_gen5$distTravel<-maxd_gen5$farseedT1-maxd_gen5$farseed


####*********************************************************************************
###CAN YOU COMBINE SILIQUES AT FRONT IN PREV. GEN WITH DISTANCE IN NEXT GEN?
##NOTE that you can only do this for treatments A and B

#here are the siliques for each ID in the furthest pot where they were counted
sil_ply<-ddply(allsil, .(ID,gen), summarise, 
                lastpot_sil=max(pot, na.rm=T), 
                lastsil= siliques[which.max(pot)])

sil_tomerge<-join(sil_ply, maxd, by=c("ID","gen"),type="left",match="first")
sil_tomerge<-sil_tomerge[,c(1:5)]

#now we need a test if these are actually the furthest pot.
#they don't match 16 times in total (these silique numbers now get NA)
sil_tomerge$lastsil<-ifelse(sil_tomerge$lastpot_sil==0, sil_tomerge$lastsil, 
              ifelse(sil_tomerge$lastpot_sil==sil_tomerge$lastpot,sil_tomerge$lastsil,NA))

sil_tomerge<-sil_tomerge[,c(1,2,4)]
sil_tomerge$gen<-sil_tomerge$gen+1

#maxd<-join(maxd, sil_tomerge, by=c("ID","gen"),type="left",match="first")

#but this is not quite what I want, because what I'd really like to know is how many cm did the invasion move between generations to run this analysis. It's not going to be that helpful to know what the very furthest seed is.

maxdT0<-maxd
names(maxdT0)<-sub("farseed","farseedT0",names(maxdT0))
maxdT0$gen<-maxdT0$gen+1
maxdT0<-maxdT0[,c(1,2,4)]

distbygen<-merge(maxd,maxdT0, by=c("ID","gen"),all.x=T)
distbygen$farseedT0<-ifelse(is.na(distbygen$farseedT0),0,distbygen$farseedT0)

distbygen$cm_travel=distbygen$farseed-distbygen$farseedT0
distbygen<-distbygen[,c(1,2,5,7,9,10,12)]

#NOW, MERGE WITH SILIQUES! (now from the current generation)
distbygen_sil<-join(sil_tomerge, distbygen, by=c("ID","gen"),type="left",match="first")

In the short run I can use the ALL SEEDLING DATA section to replace the data import part of popLer.R.
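One simplification worth making in that port: the long ladder of `ifelse()` calls that pushes dense front pots to their right edge can be collapsed into a named lookup vector. A minimal sketch, with the pot-edge distances copied from Jenn’s script; `pot_edge` and `dense` are my names, the data frame here is a toy stand-in for `maxd`, and the unconditional `lastpot == 0` case would still need its own line, as in the script:

```r
# Right-edge distance (cm) of each pot, copied from the ifelse() ladder above.
pot_edge <- c(`2` = 22, `3` = 29, `4` = 36, `5` = 44, `6` = 51, `7` = 58,
              `8` = 66, `9` = 73, `10` = 80, `11` = 88, `12` = 95,
              `13` = 102, `16` = 124, `17` = 132)

# Toy stand-in for maxd: front pot, seedlings in it, furthest-seed distance.
maxd <- data.frame(lastpot = c(2, 3, 5), lastsdlgs = c(12, 3, 10),
                   farseed = c(18, 25, 40))

# Dense front pots (>= 10 seedlings) get moved to the pot's right edge.
dense <- maxd$lastsdlgs >= 10 & as.character(maxd$lastpot) %in% names(pot_edge)
maxd$farseed[dense] <- pot_edge[as.character(maxd$lastpot[dense])]
maxd$farseed  # 22 25 44
```

One ifelse per pot becomes a single indexed assignment, and adding a new pot means adding one entry to `pot_edge`.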

Data reboot

I’ve revised popLer.R and popRIL.R in the data directory to use Jenn’s cleaned-up data. Note, however, that this loses generation 7 from both experiments. I have a query in to Jenn about this.

19 May 2017

Evolution data: Gen 7 = no

Jenn has this to say about generation 7 of the evolution experiment (email of 19 May):

Gen7 was the mystery one… it had tons of contaminants in the genetic data, seedlings went weird distances (strangely far and/or not at all), and that’s when Sara was really having a ton of trouble. We didn’t include Gen7 in the Science paper because of all these potential challenges with it (although did analyze it to make sure the results didn’t all just go away!). So I think also better to leave it out of your analyses.

Reboot Ler fecundity

Need to check if the data have changed!
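A quick way to check is `all.equal()` on the old and new data frames, since it reports which columns differ rather than a bare TRUE/FALSE. A toy illustration with stand-in frames (`old` and `new` are placeholder names; in practice they would be the Excel-based and rebooted versions of the same data frame):

```r
# Stand-ins for the old (Excel-based) and rebooted versions of a data frame.
old <- data.frame(Density = c(1, 50), eff_sd_no = c(130, 350))
new <- data.frame(Density = c(1, 50), eff_sd_no = c(130, 351))

isTRUE(all.equal(old, new))  # FALSE: something changed
all.equal(old, new)          # names the component that differs
```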

Generation 1 fecundity

(Update of original analysis of 21 April 2017)

ggplot(Ler_seed_gen1, aes(x = eff_sd_no, fill = as.factor(Density))) + 
  geom_histogram(binwidth = 50)

kable(group_by(Ler_seed_gen1, Density) %>% 
        summarise(Mean = mean(eff_sd_no), Variance = var(eff_sd_no)),
      caption = paste("Mean and variance across pots of effective seed number",
                      "in treatments B and C of Ler generation 1")
      )
Mean and variance across pots of effective seed number in treatments B and C of Ler generation 1
Density    Mean   Variance
      1   133.9   3443.433
     50   347.6  21362.933

This is identical to the original analysis.
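As a sanity check on the demographic-stochasticity question, the variance-to-mean ratio at both densities is far above the value of roughly 1 expected for a Poisson-like (purely demographic) process. A small sketch using the numbers from the table above (`fec` and `VMR` are my names):

```r
# Mean and variance of effective seed number, from the table above.
fec <- data.frame(Density  = c(1, 50),
                  Mean     = c(133.9, 347.6),
                  Variance = c(3443.433, 21362.933))

# Pure demographic (Poisson-like) stochasticity would give a ratio near 1.
fec$VMR <- fec$Variance / fec$Mean
round(fec$VMR, 1)  # 25.7 61.5
```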

Time variation in fecundity

This is the part of the 24 April analysis that looked at treatment B:

Ler3pBC <- filter(popLer, Gap == "3p", Treatment != "A")
Ler3pBC <- group_by(Ler3pBC, Treatment, Pot, Rep) %>%
  mutate(Adults = lag(Seedlings)) 
Ler3pBC <- within(Ler3pBC,
                  {
                    Adults[Treatment == "B" & !is.na(Adults)] <- 1
                    Adults[Pot == 0 & Generation == 1] <- 1
                    Adults[Treatment == "C" & Pot == 0 & Generation == 1] <- 50
                  })
filter(Ler3pBC, Treatment == "B") %>%
  ggplot(aes(x = Gen, y = Seedlings)) + xlab("Generation") + geom_boxplot(notch = TRUE) 
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.

By eye, this looks unchanged. Let’s try a log scale:

filter(Ler3pBC, Treatment == "B") %>%
  ggplot(aes(x = Gen, y = Seedlings)) + xlab("Generation") + geom_boxplot(notch = TRUE) +
  scale_y_log10()
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.

That mostly skews things in the other direction. Perhaps a square-root transform?

filter(Ler3pBC, Treatment == "B") %>%
  ggplot(aes(x = Gen, y = Seedlings)) + xlab("Generation") + geom_boxplot(notch = TRUE) +
  scale_y_sqrt()
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.
notch went outside hinges. Try setting notch=FALSE.

That’s nice. Now let’s do a linear model:

summary(lm(sqrt(Seedlings) ~ Gen, data = filter(Ler3pBC, Treatment == "B")))

Call:
lm(formula = sqrt(Seedlings) ~ Gen, data = filter(Ler3pBC, Treatment == 
    "B"))

Residuals:
    Min      1Q  Median      3Q     Max 
-6.1230 -2.2023 -0.0066  1.8917 11.5703 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   8.4447     0.9746   8.665  8.8e-13 ***
Gen2         -0.9075     1.3466  -0.674  0.50251    
Gen3         -1.6444     1.3466  -1.221  0.22601    
Gen4         -2.8167     1.2582  -2.239  0.02827 *  
Gen5         -1.4678     1.2582  -1.167  0.24724    
Gen6         -3.9232     1.2424  -3.158  0.00232 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.082 on 72 degrees of freedom
Multiple R-squared:  0.1576,    Adjusted R-squared:  0.0991 
F-statistic: 2.694 on 5 and 72 DF,  p-value: 0.02744

It actually looks like it might be linear!

summary(lm(sqrt(Seedlings) ~ Generation, data = filter(Ler3pBC, Treatment == "B")))

Call:
lm(formula = sqrt(Seedlings) ~ Generation, data = filter(Ler3pBC, 
    Treatment == "B"))

Residuals:
    Min      1Q  Median      3Q     Max 
-6.1911 -1.9276  0.1918  1.5063 12.8627 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   8.8859     0.8608  10.323 4.08e-16 ***
Generation   -0.6403     0.2074  -3.088  0.00282 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.081 on 76 degrees of freedom
Multiple R-squared:  0.1115,    Adjusted R-squared:  0.09977 
F-statistic: 9.533 on 1 and 76 DF,  p-value: 0.002816

This definitely seems like a more parsimonious model. However, it’s not clear how we should interpret it! It certainly doesn’t accord with fluctuating effects of season or insect outbreaks. It could be that the team got better at inducing dispersal. One thing to check is whether there is a similar trend in the other landscapes, or a trend in the number of seeds jumping the gap.
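A sketch of the per-landscape check: fit sqrt(Seedlings) ~ Generation within each gap size and collect the slope row of each coefficient table. Toy data here (`toy` and `slopes` are my names); in practice the input would be the treatment-B subset of popLer across all landscapes:

```r
# Toy data: four landscapes, six generations, Poisson-ish seedling counts.
set.seed(2)
toy <- data.frame(Gap        = rep(c("0p", "1p", "2p", "3p"), each = 30),
                  Generation = rep(1:6, times = 20),
                  Seedlings  = rpois(120, 60))

# One regression per landscape; keep the Generation slope and its test.
slopes <- t(sapply(split(toy, toy$Gap), function(d)
  coef(summary(lm(sqrt(Seedlings) ~ Generation, data = d)))["Generation", ]))
slopes  # one row per landscape: Estimate, Std. Error, t value, Pr(>|t|)
```

A consistent negative slope across all four landscapes would point toward a protocol effect rather than something specific to the 3p runways.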

OK, back to the reboot:

Density dependent fecundity

(also based on April 24)

qplot(data=Ler3pBC, x = Adults, y = Seedlings/Adults, 
      colour = Gen, log = "xy") + 
  geom_smooth(method = "lm")
Warning: Removed 14 rows containing non-finite values (stat_smooth).
Warning: Removed 14 rows containing missing values (geom_point).

Generation 6 looks really different.

It’s a bit less pronounced if we leave out treatment B, but it’s still there:

qplot(data=filter(Ler3pBC, Treatment == "C"), x = Adults, y = Seedlings/Adults, 
      colour = Gen, log = "xy") + 
  geom_smooth(method = "lm")
Warning: Removed 8 rows containing non-finite values (stat_smooth).
Warning: Removed 8 rows containing missing values (geom_point).

Is it just Gen 6, or is there still a linear trend in the intercept?

m1 <- lm(log(Seedlings/Adults) ~ Adults + Gen, data = Ler3pBC)
m2 <- lm(log(Seedlings/Adults) ~ Adults + Generation, data = Ler3pBC)
AIC(m1)
[1] 474.2151
AIC(m2)
[1] 481.8171
summary(m1)

Call:
lm(formula = log(Seedlings/Adults) ~ Adults + Gen, data = Ler3pBC)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.2616 -0.8728  0.0230  0.8158  3.6531 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.9266472  0.2859379  10.235   <2e-16 ***
Adults      -0.0052269  0.0003613 -14.468   <2e-16 ***
Gen2         0.0893939  0.4048630   0.221   0.8256    
Gen3         0.5141724  0.4000966   1.285   0.2010    
Gen4         0.7346576  0.4048698   1.815   0.0718 .  
Gen5         0.4482726  0.3697123   1.212   0.2274    
Gen6        -0.5019487  0.3731412  -1.345   0.1808    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.246 on 135 degrees of freedom
  (14 observations deleted due to missingness)
Multiple R-squared:  0.6429,    Adjusted R-squared:  0.627 
F-statistic: 40.51 on 6 and 135 DF,  p-value: < 2.2e-16
anova(m1)
Analysis of Variance Table

Response: log(Seedlings/Adults)
           Df Sum Sq Mean Sq  F value    Pr(>F)    
Adults      1 350.80  350.80 226.0342 < 2.2e-16 ***
Gen         5  26.39    5.28   3.4013  0.006314 ** 
Residuals 135 209.52    1.55                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(m2)

Call:
lm(formula = log(Seedlings/Adults) ~ Adults + Generation, data = Ler3pBC)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.8110 -0.9846  0.1907  0.9253  2.9576 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.3590649  0.2666127  12.599   <2e-16 ***
Adults      -0.0050648  0.0003674 -13.785   <2e-16 ***
Generation  -0.0722116  0.0652395  -1.107     0.27    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.297 on 139 degrees of freedom
  (14 observations deleted due to missingness)
Multiple R-squared:  0.6014,    Adjusted R-squared:  0.5957 
F-statistic: 104.9 on 2 and 139 DF,  p-value: < 2.2e-16

Here, if anything, it looks quadratic:

m3 <- lm(log(Seedlings/Adults) ~ Adults + poly(Generation,2), data = Ler3pBC)
AIC(m3)
[1] 471.2079
anova(m3)
Analysis of Variance Table

Response: log(Seedlings/Adults)
                     Df Sum Sq Mean Sq  F value    Pr(>F)    
Adults                1 350.80  350.80 226.2383 < 2.2e-16 ***
poly(Generation, 2)   2  21.93   10.97   7.0719  0.001192 ** 
Residuals           138 213.98    1.55                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(m3)

Call:
lm(formula = log(Seedlings/Adults) ~ Adults + poly(Generation, 
    2), data = Ler3pBC)

Residuals:
    Min      1Q  Median      3Q     Max 
-4.3838 -1.0034  0.0389  0.7699  3.5375 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)           3.1352719  0.1256288  24.957  < 2e-16 ***
Adults               -0.0052341  0.0003559 -14.708  < 2e-16 ***
poly(Generation, 2)1 -1.4438975  1.3083800  -1.104 0.271698    
poly(Generation, 2)2 -4.7130036  1.3165803  -3.580 0.000475 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.245 on 138 degrees of freedom
  (14 observations deleted due to missingness)
Multiple R-squared:  0.6353,    Adjusted R-squared:  0.6274 
F-statistic: 80.13 on 3 and 138 DF,  p-value: < 2.2e-16

Maybe. We can also try dropping Gen 6:

m4 <- lm(log(Seedlings/Adults) ~ Adults + Gen, data = filter(Ler3pBC, Generation < 6))
m5 <- lm(log(Seedlings/Adults) ~ Adults + poly(Generation, 2), 
         data = filter(Ler3pBC, Generation < 6))
AIC(m4)
[1] 342.6575
AIC(m5)
[1] 340.3917
summary(m4)

Call:
lm(formula = log(Seedlings/Adults) ~ Adults + Gen, data = filter(Ler3pBC, 
    Generation < 6))

Residuals:
     Min       1Q   Median       3Q      Max 
-2.12284 -0.85786  0.03988  0.68163  2.45417 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.9377619  0.2510228  11.703   <2e-16 ***
Adults      -0.0056859  0.0003939 -14.435   <2e-16 ***
Gen2         0.1190896  0.3556574   0.335   0.7384    
Gen3         0.5993128  0.3538179   1.694   0.0933 .  
Gen4         0.8720573  0.3621620   2.408   0.0178 *  
Gen5         0.5209411  0.3265875   1.595   0.1137    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.093 on 105 degrees of freedom
  (12 observations deleted due to missingness)
Multiple R-squared:  0.6716,    Adjusted R-squared:  0.656 
F-statistic: 42.95 on 5 and 105 DF,  p-value: < 2.2e-16
anova(m4)
Analysis of Variance Table

Response: log(Seedlings/Adults)
           Df  Sum Sq Mean Sq  F value Pr(>F)    
Adults      1 247.509 247.509 207.0324 <2e-16 ***
Gen         4   9.201   2.300   1.9241 0.1118    
Residuals 105 125.528   1.196                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(m5)

Call:
lm(formula = log(Seedlings/Adults) ~ Adults + poly(Generation, 
    2), data = filter(Ler3pBC, Generation < 6))

Residuals:
     Min       1Q   Median       3Q      Max 
-2.05718 -0.94013  0.09753  0.73707  2.36654 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)           3.3867835  0.1246611  27.168   <2e-16 ***
Adults               -0.0056051  0.0003878 -14.453   <2e-16 ***
poly(Generation, 2)1  2.4370676  1.1543458   2.111   0.0371 *  
poly(Generation, 2)2 -1.5904080  1.1492391  -1.384   0.1693    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.092 on 107 degrees of freedom
  (12 observations deleted due to missingness)
Multiple R-squared:  0.6664,    Adjusted R-squared:  0.6571 
F-statistic: 71.26 on 3 and 107 DF,  p-value: < 2.2e-16
anova(m5)
Analysis of Variance Table

Response: log(Seedlings/Adults)
                     Df  Sum Sq Mean Sq  F value  Pr(>F)    
Adults                1 247.509 247.509 207.7053 < 2e-16 ***
poly(Generation, 2)   2   7.224   3.612   3.0313 0.05241 .  
Residuals           107 127.505   1.192                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Yes: dropping generation 6 pretty much eliminates the generation effect.

24 May 2017

Reboot Ler spread rates

(Rerun of analyses from 5 and 8 May)

Look at trends by rep:

ggplot(LerC_spread, aes(x = Generation, y = speed, color = Rep)) +
  geom_point(position = "jitter") + 
  geom_smooth(method = "lm", se = FALSE) + 
  facet_wrap(~Gap)

ggplot(LerC_spread, aes(x = Generation, y = speed)) +
  geom_point(position = "jitter") + 
  geom_smooth() + 
  facet_wrap(~Gap)

ggplot(LerC_spread, aes(x = Generation, y = speed)) +
  geom_point(position = "jitter") + 
  geom_smooth(method = "gam", method.args = list(k = 4)) + 
  facet_wrap(~Gap)

summary(lm(speed ~ Generation, data = filter(LerC_spread, Gap == "0p")))

Call:
lm(formula = speed ~ Generation, data = filter(LerC_spread, Gap == 
    "0p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.4705 -1.3090  0.0295  1.1224  3.5295 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   3.2933     0.4609   7.145 1.67e-09 ***
Generation   -0.2743     0.1183  -2.318    0.024 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.566 on 58 degrees of freedom
Multiple R-squared:  0.08476,   Adjusted R-squared:  0.06898 
F-statistic: 5.371 on 1 and 58 DF,  p-value: 0.02402
library(mgcv)
summary(gam(speed ~ s(Generation, k = 4), data = filter(LerC_spread, Gap == "0p")))

Family: gaussian 
Link function: identity 

Formula:
speed ~ s(Generation, k = 4)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.3333     0.1828   12.77   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                edf Ref.df     F  p-value    
s(Generation) 2.223  2.597 7.506 0.000784 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.239   Deviance explained = 26.7%
GCV = 2.1184  Scale est. = 2.0046    n = 60
plot(gam(speed ~ s(Generation, k = 4), data = filter(LerC_spread, Gap == "0p")))

anova(gam(speed ~ s(Generation, k = 4), data = filter(LerC_spread, Gap == "0p")), 
      gam(speed ~ Generation, data = filter(LerC_spread, Gap == "0p")),
      test = "Chisq") 
Analysis of Deviance Table

Model 1: speed ~ s(Generation, k = 4)
Model 2: speed ~ Generation
  Resid. Df Resid. Dev      Df Deviance  Pr(>Chi)    
1    56.403     113.82                               
2    58.000     142.17 -1.5975  -28.351 0.0004792 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This replicates the prior analysis exactly.

Now look at the autocorrelation and interaction analysis from 8 May:

m13 <- lm(speed ~ poly(Generation, 2) * Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "0p"))
summary(m13)

Call:
lm(formula = speed ~ poly(Generation, 2) * Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "0p"))

Residuals:
     Min       1Q   Median       3Q      Max 
-2.14668 -0.47051  0.02562  0.52439  2.46495 

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)   
(Intercept)                 3.054e+00  9.512e-01   3.211   0.0046 **
poly(Generation, 2)1        1.990e+00  7.837e+00   0.254   0.8023   
poly(Generation, 2)2       -2.395e+00  7.431e+00  -0.322   0.7507   
Rep2                        4.945e-01  1.154e+00   0.429   0.6730   
Rep3                        1.295e-01  1.159e+00   0.112   0.9122   
Rep4                        2.280e+00  1.154e+00   1.976   0.0629 . 
Rep5                        6.689e-01  1.166e+00   0.574   0.5730   
Rep6                       -6.265e-01  1.150e+00  -0.545   0.5922   
Rep7                        1.540e-02  1.160e+00   0.013   0.9895   
Rep8                        1.467e+00  1.155e+00   1.270   0.2194   
Rep9                       -9.598e-01  1.150e+00  -0.835   0.4142   
Rep10                       8.141e-01  1.148e+00   0.709   0.4869   
speed_m1                   -4.215e-01  1.982e-01  -2.126   0.0468 * 
poly(Generation, 2)1:Rep2  -1.014e+01  1.115e+01  -0.910   0.3745   
poly(Generation, 2)2:Rep2  -5.019e+00  1.087e+01  -0.462   0.6495   
poly(Generation, 2)1:Rep3  -1.910e+01  1.109e+01  -1.722   0.1012   
poly(Generation, 2)2:Rep3   7.551e+00  1.057e+01   0.714   0.4838   
poly(Generation, 2)1:Rep4  -1.356e+01  1.122e+01  -1.208   0.2417   
poly(Generation, 2)2:Rep4   3.330e+00  1.155e+01   0.288   0.7762   
poly(Generation, 2)1:Rep5  -3.324e+00  1.110e+01  -0.299   0.7678   
poly(Generation, 2)2:Rep5  -8.064e+00  1.046e+01  -0.771   0.4502   
poly(Generation, 2)1:Rep6  -3.761e+00  1.109e+01  -0.339   0.7382   
poly(Generation, 2)2:Rep6   5.375e-14  1.044e+01   0.000   1.0000   
poly(Generation, 2)1:Rep7  -1.072e+01  1.111e+01  -0.965   0.3467   
poly(Generation, 2)2:Rep7  -3.057e+00  1.080e+01  -0.283   0.7801   
poly(Generation, 2)1:Rep8   5.176e+00  1.121e+01   0.462   0.6496   
poly(Generation, 2)2:Rep8  -7.049e+00  1.053e+01  -0.669   0.5114   
poly(Generation, 2)1:Rep9   2.805e+00  1.108e+01   0.253   0.8028   
poly(Generation, 2)2:Rep9  -1.315e+01  1.057e+01  -1.244   0.2287   
poly(Generation, 2)1:Rep10 -6.363e+00  1.115e+01  -0.571   0.5750   
poly(Generation, 2)2:Rep10 -1.054e+01  1.087e+01  -0.970   0.3444   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.43 on 19 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.7304,    Adjusted R-squared:  0.3048 
F-statistic: 1.716 on 30 and 19 DF,  p-value: 0.1102
car::Anova(m13)
Anova Table (Type II tests)

Response: speed
                        Sum Sq Df F value    Pr(>F)    
poly(Generation, 2)     46.959  2 11.4785 0.0005389 ***
Rep                     32.487  9  1.7646 0.1422649    
speed_m1                 9.249  1  4.5217 0.0467938 *  
poly(Generation, 2):Rep 29.613 18  0.8043 0.6761230    
Residuals               38.865 19                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Drop the interaction:

m14 <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "0p"))
summary(m14)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "0p"))

Residuals:
     Min       1Q   Median       3Q      Max 
-2.89320 -0.68615 -0.06304  0.68486  2.49198 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)            2.9649     0.7018   4.225  0.00015 ***
poly(Generation, 2)1  -4.2553     2.4272  -1.753  0.08786 .  
poly(Generation, 2)2  -5.5072     2.3684  -2.325  0.02565 *  
Rep2                   0.2662     0.8609   0.309  0.75893    
Rep3                  -0.7985     0.8653  -0.923  0.36206    
Rep4                   1.5985     0.8653   1.847  0.07269 .  
Rep5                   0.7309     0.8738   0.836  0.40831    
Rep6                  -0.7323     0.8626  -0.849  0.40133    
Rep7                  -0.2662     0.8609  -0.309  0.75893    
Rep8                   1.7970     0.8797   2.043  0.04824 *  
Rep9                  -0.4000     0.8604  -0.465  0.64473    
Rep10                  0.8647     0.8690   0.995  0.32619    
speed_m1              -0.3309     0.1526  -2.169  0.03660 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.36 on 37 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.5251,    Adjusted R-squared:  0.371 
F-statistic: 3.409 on 12 and 37 DF,  p-value: 0.002013
car::Anova(m14)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value    Pr(>F)    
poly(Generation, 2) 46.959  2 12.6865 6.372e-05 ***
Rep                 32.487  9  1.9504   0.07456 .  
speed_m1             8.705  1  4.7036   0.03660 *  
Residuals           68.478 37                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

So, in conclusion: weak evidence for among-Rep differences in means, but strong evidence for negative autocorrelation. Note that this autocorrelation should act to slow the rate at which the variance in cumulative spread increases with time.
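To convince myself of that last claim, here is a quick simulation sketch (hypothetical AR(1) speeds, not fit to the data) showing that negative autocorrelation in per-generation spread slows the growth of variance in cumulative spread:

```r
set.seed(1)
# Variance of cumulative spread by generation, for AR(1) per-generation speeds
sim_cum_var <- function(phi, n_gen = 6, n_rep = 10000) {
  eps <- matrix(rnorm(n_gen * n_rep), n_gen, n_rep)
  x <- eps
  for (t in 2:n_gen) x[t, ] <- phi * x[t - 1, ] + eps[t, ]
  apply(apply(x, 2, cumsum), 1, var)  # var across reps at each generation
}
sim_cum_var(phi = -0.3)  # grows more slowly than...
sim_cum_var(phi = 0)     # ...the iid case
```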

Let’s check out the other landscapes:

m24 <- lm(speed ~ Gen + Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "1p"))
m34 <- lm(speed ~ Gen + Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "2p"))
m44 <- lm(speed ~ Gen + Rep + speed_m1, 
          data = filter(LerC_spread, Gap == "3p"))
summary(m24)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "1p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-1.9393 -1.0203 -0.2807  0.7435  3.0813 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept)  1.939e+00  8.046e-01   2.410   0.0213 *
Gen3        -6.966e-01  6.728e-01  -1.035   0.3076  
Gen4        -6.345e-01  6.671e-01  -0.951   0.3481  
Gen5        -2.000e-01  6.664e-01  -0.300   0.7658  
Gen6        -1.131e+00  6.692e-01  -1.690   0.0999 .
Rep2         1.379e-01  9.504e-01   0.145   0.8855  
Rep3        -1.100e-15  9.424e-01   0.000   1.0000  
Rep4        -1.008e-15  9.424e-01   0.000   1.0000  
Rep5         9.379e-01  9.504e-01   0.987   0.3305  
Rep6         4.000e-01  9.424e-01   0.424   0.6738  
Rep7        -3.310e-01  9.444e-01  -0.351   0.7280  
Rep8         4.690e-01  9.444e-01   0.497   0.6226  
Rep9         4.000e-01  9.424e-01   0.424   0.6738  
Rep10        1.007e+00  9.604e-01   1.048   0.3017  
speed_m1    -1.724e-01  1.543e-01  -1.117   0.2715  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.49 on 35 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.1932,    Adjusted R-squared:  -0.1295 
F-statistic: 0.5988 on 14 and 35 DF,  p-value: 0.8474
summary(m34)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "2p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.9139 -0.8927 -0.3292  0.9515  3.3861 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept)  2.206e+00  8.573e-01   2.573   0.0145 *
Gen3        -3.000e-01  7.061e-01  -0.425   0.6736  
Gen4        -5.386e-02  7.076e-01  -0.076   0.9398  
Gen5         3.010e-16  7.061e-01   0.000   1.0000  
Gen6        -6.000e-01  7.061e-01  -0.850   0.4013  
Rep2         1.077e-01  1.003e+00   0.107   0.9151  
Rep3        -1.200e+00  9.986e-01  -1.202   0.2376  
Rep4        -1.200e+00  9.986e-01  -1.202   0.2376  
Rep5        -1.308e+00  1.003e+00  -1.304   0.2007  
Rep6        -1.308e+00  1.003e+00  -1.304   0.2007  
Rep7         7.077e-01  1.003e+00   0.706   0.4850  
Rep8        -1.200e+00  9.986e-01  -1.202   0.2376  
Rep9        -2.015e+00  1.015e+00  -1.985   0.0550 .
Rep10       -4.923e-01  1.003e+00  -0.491   0.6266  
speed_m1    -1.795e-01  1.524e-01  -1.178   0.2469  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.579 on 35 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.2666,    Adjusted R-squared:  -0.02679 
F-statistic: 0.9087 on 14 and 35 DF,  p-value: 0.558
summary(m44)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(LerC_spread, 
    Gap == "3p"))

Residuals:
    Min      1Q  Median      3Q     Max 
-1.9660 -0.6326 -0.5617  0.2950  3.4950 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)  
(Intercept) -3.660e-01  8.802e-01  -0.416   0.6804  
Gen3         7.092e-02  7.694e-01   0.092   0.9271  
Gen4         1.404e+00  7.694e-01   1.825   0.0776 .
Gen5         2.837e-01  8.417e-01   0.337   0.7384  
Gen6         7.092e-02  7.694e-01   0.092   0.9271  
Rep2         9.277e-01  1.038e+00   0.894   0.3782  
Rep3         9.277e-01  1.038e+00   0.894   0.3782  
Rep4         1.855e+00  1.073e+00   1.728   0.0939 .
Rep5         9.277e-01  1.038e+00   0.894   0.3782  
Rep6         9.277e-01  1.038e+00   0.894   0.3782  
Rep7         8.000e-01  1.025e+00   0.780   0.4412  
Rep9         9.277e-01  1.038e+00   0.894   0.3782  
Rep10       -8.426e-16  1.025e+00   0.000   1.0000  
speed_m1    -1.596e-01  1.983e-01  -0.805   0.4272  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.621 on 31 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.2256,    Adjusted R-squared:  -0.0991 
F-statistic: 0.6948 on 13 and 31 DF,  p-value: 0.7534
car::Anova(m24)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value Pr(>F)
Gen        7.794  4  0.8776 0.4873
Rep        7.984  9  0.3996 0.9268
speed_m1   2.772  1  1.2485 0.2715
Residuals 77.708 35               
car::Anova(m34)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value Pr(>F)
Gen        2.708  4  0.2715 0.8943
Rep       28.733  9  1.2805 0.2819
speed_m1   3.458  1  1.3869 0.2469
Residuals 87.262 35               
car::Anova(m44)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value Pr(>F)
Gen       12.533  4  1.1918 0.3339
Rep       10.906  8  0.5186 0.8332
speed_m1   1.702  1  0.6475 0.4272
Residuals 81.498 31               

Nothing to see here, folks!

The pattern of autocorrelation, I think, falls into the category of things that we want the model to reproduce.


And fortunately, nothing has changed from the prior analysis.

Ler Cumulative spread

And finally double check the cumulative spread plots:

Let’s calculate the means and variances of cumulative spread:

cum_spread_stats <- group_by(LerC_spread, Gap, Gen, Generation) %>%
  summarise(mean_spread = mean(Furthest),
            var_spread = var(Furthest),
            CV_spread = sqrt(var_spread)/mean_spread
  )

And now plot the results:

ggplot(aes(x = Generation, y = mean_spread, color = Gap), data = cum_spread_stats) +
  geom_point() + geom_smooth(method = "lm")

ggplot(aes(x = Generation, y = var_spread, color = Gap), data = cum_spread_stats) +
  geom_point() + geom_line()

ggplot(aes(x = Generation, y = CV_spread, color = Gap), data = cum_spread_stats) +
  geom_point() + geom_line()
Warning: Removed 1 rows containing missing values (geom_point).
Warning: Removed 1 rows containing missing values (geom_path).

This looks just like the previous results—repeating my previous summary,

The linear approximation to mean spread is pretty good, although we see a dip in generation 6, as expected from the reduced seed production, and even without gen 6, the continuous runways seem to be decelerating. The ranking between landscapes makes sense. The variances are rather more all over the map, although we don’t have confidence intervals on them. But except for the 1-pot gaps, we can squint and imagine that the variances are linear in time. Note, however, that with only 6 generations we’re not going to easily detect nonlinearities. I don’t know that the CVs tell us much.

It really does seem relevant to do the bootstrapping, to see whether the (rather nonsystematic) patterns in the cumulative variance vs. time are real.
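A sketch of what that bootstrap might look like (assuming `LerC_spread` with columns `Gap`, `Rep`, `Generation`, `Furthest`): resample whole replicate runways with replacement, so that the within-rep correlation structure is preserved, and recompute the variance of cumulative spread at each generation.

```r
library(dplyr)

# Bootstrap the variance of cumulative spread by resampling whole reps
boot_cum_var <- function(d, n_boot = 1000) {
  reps <- unique(d$Rep)
  t(replicate(n_boot, {
    samp <- sample(reps, replace = TRUE)
    boot_d <- bind_rows(lapply(samp, function(r) filter(d, Rep == r)))
    tapply(boot_d$Furthest, boot_d$Generation, var, na.rm = TRUE)
  }))
}

# e.g., 95% CIs for the continuous runways
ci_0p <- apply(boot_cum_var(filter(LerC_spread, Gap == "0p")), 2,
               quantile, probs = c(0.025, 0.975), na.rm = TRUE)
```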

Reboot spread analysis of evolution experiments

(From 10 May)

Let’s calculate the means and variances of cumulative spread:

cum_spread_stats <- group_by(RIL_spread, Treatment, Gap, Gen, Generation) %>%
  summarise(mean_spread = mean(Furthest),
            var_spread = var(Furthest),
            CV_spread = sqrt(var_spread)/mean_spread
  )

And now plot the results:

ggplot(aes(x = Generation, y = mean_spread, color = Treatment), data = cum_spread_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = var_spread, color = Treatment), data = cum_spread_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = CV_spread, color = Treatment), data = cum_spread_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)
Warning: Removed 1 rows containing missing values (geom_point).

The patterns in the mean are probably not overly distinguishable from linear, although I’d want to see CIs.

Let’s look at per-generation spread.

speed_stats <- group_by(RIL_spread, Treatment, Gap, Gen, Generation) %>%
  summarise(mean_spread = mean(speed),
            var_spread = var(speed),
            CV_spread = sqrt(var_spread)/mean_spread
  )

And now plot the results:

ggplot(aes(x = Generation, y = mean_spread, color = Treatment), data = speed_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = var_spread, color = Treatment), data = speed_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)

ggplot(aes(x = Generation, y = CV_spread, color = Treatment), data = speed_stats) +
  geom_point() + geom_line() + facet_wrap(~ Gap)
Warning: Removed 5 rows containing missing values (geom_point).

These plots look pretty similar (though not identical) to the second pass in the original analysis (after I dropped the problematic rep).

In 0p and 1p, the treatments look very similar. But I bet there will be differences in the autocorrelation structure. Let’s fit some models.

m1_0NO <- lm(speed ~ Gen + Rep + speed_m1, data = filter(RIL_spread, Gap == "0p", Treatment == "NO"))
summary(m1_0NO)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "NO"))

Residuals:
     Min       1Q   Median       3Q      Max 
-1.67770 -0.77997 -0.03829  0.87593  1.87512 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.47770    0.62701   3.952 0.000359 ***
Gen3        -0.17399    0.50495  -0.345 0.732474    
Gen4        -0.44960    0.49971  -0.900 0.374424    
Gen5         0.10000    0.49871   0.201 0.842237    
Gen6         1.15121    0.50767   2.268 0.029631 *  
Rep2        -0.14960    0.70599  -0.212 0.833417    
Rep3         0.95282    0.73925   1.289 0.205887    
Rep4         0.10081    0.70812   0.142 0.887615    
Rep5         1.00161    0.71655   1.398 0.170964    
Rep6        -0.04879    0.71164  -0.069 0.945728    
Rep7         0.55121    0.71164   0.775 0.443804    
Rep8         1.05201    0.72281   1.455 0.154458    
Rep9         1.50242    0.73039   2.057 0.047199 *  
Rep10        1.55282    0.73925   2.101 0.042955 *  
speed_m1    -0.25201    0.15821  -1.593 0.120183    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.115 on 35 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.4056,    Adjusted R-squared:  0.1678 
F-statistic: 1.706 on 14 and 35 DF,  p-value: 0.09938
car::Anova(m1_0NO)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value  Pr(>F)  
Gen       14.656  4  2.9464 0.03367 *
Rep       16.263  9  1.4531 0.20406  
speed_m1   3.155  1  2.5372 0.12018  
Residuals 43.525 35                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m1_0YES <- lm(speed ~ Gen + Rep + speed_m1, data = filter(RIL_spread, Gap == "0p", Treatment == "YES"))
summary(m1_0YES)

Call:
lm(formula = speed ~ Gen + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "YES"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.9829 -1.0494 -0.1159  1.3510  2.8446 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  3.883e+00  1.038e+00   3.743 0.000742 ***
Gen3         1.160e+00  8.272e-01   1.403 0.170584    
Gen4         6.541e-01  8.501e-01   0.769 0.447472    
Gen5         1.988e+00  8.278e-01   2.401 0.022530 *  
Gen6         3.329e-01  8.875e-01   0.375 0.710116    
Rep2        -8.888e-01  1.110e+00  -0.801 0.429366    
Rep4        -6.665e-01  1.114e+00  -0.598 0.553872    
Rep5        -8.885e-02  1.110e+00  -0.080 0.936718    
Rep6        -1.777e-01  1.111e+00  -0.160 0.874011    
Rep7        -5.554e-01  1.117e+00  -0.497 0.622542    
Rep8        -1.089e+00  1.110e+00  -0.981 0.334217    
Rep9         1.566e-15  1.110e+00   0.000 1.000000    
Rep10       -1.554e-01  1.117e+00  -0.139 0.890260    
speed_m1    -4.442e-01  1.610e-01  -2.759 0.009630 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.754 on 31 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.4034,    Adjusted R-squared:  0.1532 
F-statistic: 1.612 on 13 and 31 DF,  p-value: 0.135
car::Anova(m1_0YES)
Anova Table (Type II tests)

Response: speed
          Sum Sq Df F value  Pr(>F)   
Gen       21.292  4  1.7295 0.16864   
Rep        6.668  8  0.2708 0.97078   
speed_m1  23.436  1  7.6148 0.00963 **
Residuals 95.408 31                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here the treatments start to differ. Let’s try some models with Generation as a continuous predictor:

m1_0NO <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "0p", Treatment == "NO"))
summary(m1_0NO)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "NO"))

Residuals:
     Min       1Q   Median       3Q      Max 
-1.73288 -0.79389 -0.01207  0.84396  1.91342 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           2.71691    0.55729   4.875 2.07e-05 ***
poly(Generation, 2)1  0.30080    1.89328   0.159   0.8746    
poly(Generation, 2)2  4.51283    1.78375   2.530   0.0158 *  
Rep2                 -0.15343    0.69108  -0.222   0.8255    
Rep3                  0.92600    0.72207   1.282   0.2077    
Rep4                  0.09314    0.69306   0.134   0.8938    
Rep5                  0.98628    0.70091   1.407   0.1677    
Rep6                 -0.06029    0.69634  -0.087   0.9315    
Rep7                  0.53971    0.69634   0.775   0.4432    
Rep8                  1.03286    0.70675   1.461   0.1523    
Rep9                  1.47943    0.71381   2.073   0.0452 *  
Rep10                 1.52600    0.72207   2.113   0.0414 *  
speed_m1             -0.23286    0.15103  -1.542   0.1316    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.092 on 37 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.3978,    Adjusted R-squared:  0.2025 
F-statistic: 2.037 on 12 and 37 DF,  p-value: 0.0486
car::Anova(m1_0NO)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value  Pr(>F)   
poly(Generation, 2) 14.088  2  5.9110 0.00592 **
Rep                 15.973  9  1.4893 0.18820   
speed_m1             2.833  1  2.3772 0.13163   
Residuals           44.093 37                   
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m1_0YES <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "0p", Treatment == "YES"))
summary(m1_0YES)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "YES"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.8015 -1.1626 -0.3335  1.1958  2.9824 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)           4.591e+00  9.727e-01   4.720  4.2e-05 ***
poly(Generation, 2)1  5.606e+00  3.155e+00   1.777  0.08480 .  
poly(Generation, 2)2 -4.831e+00  2.924e+00  -1.652  0.10799    
Rep2                 -9.053e-01  1.129e+00  -0.802  0.42827    
Rep4                 -7.159e-01  1.132e+00  -0.632  0.53153    
Rep5                 -1.053e-01  1.129e+00  -0.093  0.92623    
Rep6                 -2.106e-01  1.130e+00  -0.186  0.85329    
Rep7                 -6.212e-01  1.135e+00  -0.547  0.58790    
Rep8                 -1.105e+00  1.129e+00  -0.979  0.33459    
Rep9                  8.513e-16  1.128e+00   0.000  1.00000    
Rep10                -2.212e-01  1.135e+00  -0.195  0.84668    
speed_m1             -5.266e-01  1.566e-01  -3.363  0.00196 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.784 on 33 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.3432,    Adjusted R-squared:  0.1242 
F-statistic: 1.567 on 11 and 33 DF,  p-value: 0.1549
car::Anova(m1_0YES)
Anova Table (Type II tests)

Response: speed
                     Sum Sq Df F value   Pr(>F)   
poly(Generation, 2)  11.668  2  1.8329 0.175855   
Rep                   6.826  8  0.2681 0.971930   
speed_m1             35.996  1 11.3097 0.001965 **
Residuals           105.032 33                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m1_0YES <- lm(speed ~ Generation + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "0p", Treatment == "YES"))
summary(m1_0YES)

Call:
lm(formula = speed ~ Generation + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "0p", Treatment == "YES"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.3642 -1.2625 -0.3097  1.4350  3.2240 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.297e+00  1.159e+00   3.708  0.00074 ***
Generation   1.923e-01  2.037e-01   0.944  0.35188    
Rep2        -9.097e-01  1.157e+00  -0.786  0.43718    
Rep4        -7.292e-01  1.161e+00  -0.628  0.53400    
Rep5        -1.097e-01  1.157e+00  -0.095  0.92500    
Rep6        -2.195e-01  1.158e+00  -0.189  0.85085    
Rep7        -6.390e-01  1.164e+00  -0.549  0.58654    
Rep8        -1.110e+00  1.157e+00  -0.959  0.34430    
Rep9         1.824e-16  1.157e+00   0.000  1.00000    
Rep10       -2.390e-01  1.164e+00  -0.205  0.83852    
speed_m1    -5.487e-01  1.599e-01  -3.431  0.00159 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.829 on 34 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.2889,    Adjusted R-squared:  0.07969 
F-statistic: 1.381 on 10 and 34 DF,  p-value: 0.2305
car::Anova(m1_0YES)
Anova Table (Type II tests)

Response: speed
            Sum Sq Df F value   Pr(>F)   
Generation   2.980  1  0.8909 0.351884   
Rep          6.878  8  0.2570 0.975431   
speed_m1    39.380  1 11.7738 0.001595 **
Residuals  113.720 34                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

So we get that the evolution treatment has negative autocorrelation but no time effect, whereas the no-evolution treatment has a quadratic time effect but no autocorrelation.

Let’s look at the rest of the landscapes.

m2_0NO <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "1p", Treatment == "NO"))
summary(m2_0NO)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "1p", Treatment == "NO"))

Residuals:
     Min       1Q   Median       3Q      Max 
-2.71489 -0.58274  0.01871  0.55243  2.84783 

Coefficients:
                      Estimate Std. Error t value Pr(>|t|)   
(Intercept)           1.267502   0.575288   2.203  0.03388 * 
poly(Generation, 2)1  5.782657   2.092024   2.764  0.00884 **
poly(Generation, 2)2  0.001457   2.059857   0.001  0.99944   
Rep2                  0.686016   0.773317   0.887  0.38075   
Rep3                  1.229024   0.785643   1.564  0.12625   
Rep4                  0.143008   0.765825   0.187  0.85289   
Rep5                  0.143008   0.765825   0.187  0.85289   
Rep6                 -0.256992   0.765825  -0.336  0.73909   
Rep7                  1.229024   0.785643   1.564  0.12625   
Rep8                  1.086016   0.773317   1.404  0.16855   
Rep9                  0.686016   0.773317   0.887  0.38075   
Rep10                -1.086016   0.773317  -1.404  0.16855   
speed_m1             -0.357520   0.154986  -2.307  0.02677 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.207 on 37 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.4012,    Adjusted R-squared:  0.207 
F-statistic: 2.066 on 12 and 37 DF,  p-value: 0.0454
car::Anova(m2_0NO)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value   Pr(>F)   
poly(Generation, 2) 18.403  2  6.3170 0.004363 **
Rep                 20.014  9  1.5267 0.174941   
speed_m1             7.751  1  5.3213 0.026767 * 
Residuals           53.895 37                    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m2_0YES <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "1p", Treatment == "YES"))
summary(m2_0YES)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "1p", Treatment == "YES"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.7791 -0.9350 -0.1574  0.8140  3.1582 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)  
(Intercept)           0.99237    0.69218   1.434   0.1601  
poly(Generation, 2)1 -1.26659    2.60077  -0.487   0.6291  
poly(Generation, 2)2  2.01957    2.48615   0.812   0.4218  
Rep2                  1.42446    0.97112   1.467   0.1509  
Rep3                  0.54964    0.95827   0.574   0.5697  
Rep4                  0.07482    0.95049   0.079   0.9377  
Rep5                  0.14964    0.95827   0.156   0.8768  
Rep6                  1.34964    0.95827   1.408   0.1674  
Rep7                  1.64891    1.03772   1.589   0.1206  
Rep8                  2.37409    1.01111   2.348   0.0243 *
Rep9                  0.87482    0.95049   0.920   0.3633  
Rep10                 0.54964    0.95827   0.574   0.5697  
speed_m1             -0.18705    0.17598  -1.063   0.2947  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.499 on 37 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.2339,    Adjusted R-squared:  -0.01459 
F-statistic: 0.9413 on 12 and 37 DF,  p-value: 0.5184
car::Anova(m2_0YES)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value Pr(>F)
poly(Generation, 2)  1.487  2  0.3311 0.7202
Rep                 22.920  9  1.1338 0.3646
speed_m1             2.537  1  1.1297 0.2947
Residuals           83.108 37               
m2_0NO <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "2p", Treatment == "NO"))
summary(m2_0NO)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "2p", Treatment == "NO"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.3582 -0.9472 -0.0689  0.4769  4.5809 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)  
(Intercept)            0.9925     0.7068   1.404   0.1686  
poly(Generation, 2)1  -6.3086     2.6583  -2.373   0.0229 *
poly(Generation, 2)2   4.4654     2.6698   1.673   0.1029  
Rep2                  -0.6000     0.9690  -0.619   0.5396  
Rep3                   1.5067     1.0179   1.480   0.1473  
Rep4                   1.4045     0.9910   1.417   0.1648  
Rep5                  -0.6000     0.9690  -0.619   0.5396  
Rep6                   0.1022     0.9745   0.105   0.9170  
Rep7                   0.7022     0.9745   0.721   0.4757  
Rep8                   0.8045     0.9910   0.812   0.4221  
Rep9                   0.8045     0.9910   0.812   0.4221  
Rep10                 -0.6000     0.9690  -0.619   0.5396  
speed_m1              -0.1704     0.1731  -0.984   0.3314  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.532 on 37 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.3146,    Adjusted R-squared:  0.09233 
F-statistic: 1.415 on 12 and 37 DF,  p-value: 0.2027
car::Anova(m2_0NO)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value  Pr(>F)  
poly(Generation, 2) 13.360  2  2.8457 0.07086 .
Rep                 24.258  9  1.1483 0.35548  
speed_m1             2.274  1  0.9687 0.33139  
Residuals           86.852 37                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m2_0YES <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "2p", Treatment == "YES"))
summary(m2_0YES)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "2p", Treatment == "YES"))

Residuals:
    Min      1Q  Median      3Q     Max 
-2.7756 -1.6122 -0.0111  1.1509  4.0250 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)  
(Intercept)           1.145e+00  9.297e-01   1.231   0.2268  
poly(Generation, 2)1  2.677e+00  3.742e+00   0.716   0.4793  
poly(Generation, 2)2 -9.610e+00  3.614e+00  -2.660   0.0120 *
Rep2                  5.398e-16  1.266e+00   0.000   1.0000  
Rep3                  8.006e-01  1.270e+00   0.630   0.5329  
Rep4                 -8.006e-01  1.270e+00  -0.630   0.5329  
Rep5                  8.006e-01  1.270e+00   0.630   0.5329  
Rep7                 -8.006e-01  1.270e+00  -0.630   0.5329  
Rep8                  1.601e+00  1.282e+00   1.249   0.2206  
Rep9                 -5.473e-16  1.266e+00   0.000   1.0000  
Rep10                 8.006e-01  1.270e+00   0.630   0.5329  
speed_m1             -3.343e-01  1.678e-01  -1.992   0.0547 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.002 on 33 degrees of freedom
  (9 observations deleted due to missingness)
Multiple R-squared:  0.3065,    Adjusted R-squared:  0.07535 
F-statistic: 1.326 on 11 and 33 DF,  p-value: 0.2543
car::Anova(m2_0YES)
Anova Table (Type II tests)

Response: speed
                     Sum Sq Df F value  Pr(>F)  
poly(Generation, 2)  38.775  2  4.8352 0.01440 *
Rep                  23.282  8  0.7258 0.66772  
speed_m1             15.911  1  3.9683 0.05469 .
Residuals           132.317 33                  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
m2_0NO <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
             data = filter(RIL_spread, Gap == "3p", Treatment == "NO"))
summary(m2_0NO)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "3p", Treatment == "NO"))

Residuals:
    Min      1Q  Median      3Q     Max 
-1.9572 -0.6343 -0.1966  0.2101  3.2495 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)           9.520e-01  6.044e-01   1.575    0.124
poly(Generation, 2)1 -1.301e+00  2.274e+00  -0.572    0.571
poly(Generation, 2)2  2.980e+00  2.165e+00   1.376    0.177
Rep2                  1.606e-01  8.432e-01   0.190    0.850
Rep3                 -8.000e-01  8.297e-01  -0.964    0.341
Rep4                 -8.000e-01  8.297e-01  -0.964    0.341
Rep5                 -8.000e-01  8.297e-01  -0.964    0.341
Rep6                  9.606e-01  8.432e-01   1.139    0.262
Rep7                 -8.000e-01  8.297e-01  -0.964    0.341
Rep8                  1.606e-01  8.432e-01   0.190    0.850
Rep9                 -1.216e-16  8.297e-01   0.000    1.000
Rep10                -6.394e-01  8.432e-01  -0.758    0.453
speed_m1             -2.008e-01  1.879e-01  -1.068    0.292

Residual standard error: 1.312 on 37 degrees of freedom
  (10 observations deleted due to missingness)
Multiple R-squared:  0.2462,    Adjusted R-squared:  0.001712 
F-statistic: 1.007 on 12 and 37 DF,  p-value: 0.4621
car::Anova(m2_0NO)
Anova Table (Type II tests)

Response: speed
                    Sum Sq Df F value Pr(>F)
poly(Generation, 2)  3.518  2  1.0221 0.3698
Rep                 15.658  9  1.0109 0.4491
speed_m1             1.964  1  1.1411 0.2923
Residuals           63.682 37               
m2_0YES <- lm(speed ~ poly(Generation, 2) + Rep + speed_m1, 
              data = filter(RIL_spread, Gap == "3p", Treatment == "YES"))
summary(m2_0YES)

Call:
lm(formula = speed ~ poly(Generation, 2) + Rep + speed_m1, data = filter(RIL_spread, 
    Gap == "3p", Treatment == "YES"))

Residuals:
    Min      1Q  Median      3Q     Max 
-3.2641 -1.1870 -0.2878  1.0908  5.1321 

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)    
(Intercept)           3.744e+00  9.449e-01   3.962 0.000443 ***
poly(Generation, 2)1  4.688e+00  3.527e+00   1.329 0.194157    
poly(Generation, 2)2 -4.498e+00  3.219e+00  -1.398 0.172836    
Rep2                 -1.944e+00  1.251e+00  -1.554 0.131072    
Rep3                 -1.372e+00  1.235e+00  -1.111 0.275663    
Rep4                 -1.373e-15  1.229e+00   0.000 1.000000    
Rep7                 -5.718e-01  1.235e+00  -0.463 0.646748    
Rep8                 -1.372e+00  1.235e+00  -1.111 0.275663    
Rep9                 -2.744e+00  1.251e+00  -2.193 0.036444 *  
Rep10                -2.744e+00  1.251e+00  -2.193 0.036444 *  
speed_m1             -7.147e-01  1.447e-01  -4.940    3e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.944 on 29 degrees of freedom
  (8 observations deleted due to missingness)
Multiple R-squared:  0.503, Adjusted R-squared:  0.3316 
F-statistic: 2.935 on 10 and 29 DF,  p-value: 0.01141
car::Anova(m2_0YES)
Anova Table (Type II tests)

Response: speed
                     Sum Sq Df F value    Pr(>F)    
poly(Generation, 2)   8.499  2  1.1251    0.3384    
Rep                  39.412  7  1.4906    0.2097    
speed_m1             92.175  1 24.4029 2.998e-05 ***
Residuals           109.539 29                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

For evolution YES, we have:

  • Negative autocorrelation in gaps 0p, 2p, and 3p
  • Generation effects in gap 2p

For evolution NO, we have:

  • Negative autocorrelation in gap 1p
  • Generation effects in gaps 0p and 1p

Well, so what? This will require a bit of puzzling.

One thing to notice: the per-generation variance in spread is, on average, higher in the evolution treatment across all landscapes. However, this translates to a higher variance in cumulative spread only in landscapes 1p and 3p (and I’m guessing the CI on 3p is large). 1p is the landscape with no significant autocorrelation in the evolution treatment; 3p, however, has a large negative autocorrelation.

Modeling

OK, here’s the part I’ve been dreading: reconstructing the models from almost 2 years ago…

Well, the first thing I found noodling around is some models of Ler DD fecundity in LerSims. But then I got distracted thinking about sources of stochasticity. First, we actually have estimates of demographic stochasticity in the RIL density experiments, as siliques were counted on multiple plants per pot. See fecundity_variability. I can’t remember whether any of the silique counts in the population experiments were recorded by individual plant—I suspect not.

Second, there are two aspects of “environmental stochasticity.” The first is location of the pot within the greenhouse, which could affect both temperature/light and exposure to pests. In this case, we would expect nearby pots to have similar residuals. We don’t have a map of where pots were, but we do know that pots in the same runway were close to one another—thus we’d expect, within a given generation, less variability among pots in a runway than overall across all runways.

However, another source is inconsistency in the watering treatment, which affects both the total number of seeds released per silique (effective fecundity) and the dispersal kernel (in addition to creating heterogeneous dispersal, this will create extra variability in the number of seeds staying in the home pot, which is our measure of fecundity in many cases). This could easily vary from pot to pot within a runway.
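A toy illustration of that last point (all numbers hypothetical): if the fraction of seeds retained in the home pot varies from pot to pot, home-pot counts become overdispersed relative to Poisson expectation, even when seed production itself is Poisson.

```r
set.seed(2)
n_pots <- 1000
seeds  <- rpois(n_pots, 100)            # seeds produced per pot (Poisson)
p_stay <- rbeta(n_pots, 8, 2)           # pot-specific retention probability (mean 0.8)
home   <- rbinom(n_pots, seeds, p_stay) # seeds counted in the home pot
var(home) / mean(home)                  # well above 1; with fixed p_stay it would be ~1
```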

31 May 2017

Silique counts in the Ler population data

I’ve loaded in the silique count data from ~Arabidopsis/analysis/*_siliques.csv. I’m not actually sure how “clean” these are; I had to add the gap treatments to generation zero, and in the process found one error in generation 1; I didn’t look at the other generations.

The script is in data/popLer_sil.R and the data frame is popLer_sil.

Patterns of silique variation

Let’s look at silique production in treatment B as a function of replicate (here labeled “ID”) and generation. If my speculations last week hold, then ID:Gen should be significant, but probably not the main effect of ID.

In order to have enough replication to get at the interaction, I’ll look only at the continuous runways and not at generation zero.

sil_data <- subset(popLer_sil, Treatment == "B" & Generation > 0 & Gap == "0p")
summary(aov(Siliques ~ Gen * ID, data = sil_data)) 
             Df Sum Sq Mean Sq F value   Pr(>F)    
Gen           4 116793   29198  16.377 4.22e-12 ***
ID           10  65216    6522   3.658  0.00013 ***
Gen:ID       37 201238    5439   3.051 8.07e-08 ***
Residuals   288 513482    1783                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

So I’m not sure how to interpret the main effect of ID. Let’s look at the coefficients of a regression to see.

summary(lm(Siliques ~ Gen * ID, data = sil_data)) 

Call:
lm(formula = Siliques ~ Gen * ID, data = sil_data)

Residuals:
     Min       1Q   Median       3Q      Max 
-131.600  -21.557   -1.404   18.703  157.000 

Coefficients: (3 not defined because of singularities)
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  120.0000    29.8573   4.019 7.46e-05 ***
Gen2         -58.5000    36.5676  -1.600   0.1107    
Gen3         -53.1111    33.0085  -1.609   0.1087    
Gen4          -2.6923    32.0719  -0.084   0.9332    
Gen5          -6.8125    31.6685  -0.215   0.8298    
ID84          49.5000    42.2247   1.172   0.2420    
ID88         -38.0000    38.5457  -0.986   0.3250    
ID91          13.0000    42.2247   0.308   0.7584    
ID92         -86.1875    43.5242  -1.980   0.0486 *  
ID94         -22.5000    42.2247  -0.533   0.5945    
ID96         -40.0000    51.7144  -0.773   0.4399    
ID99         -14.0000    42.2247  -0.332   0.7405    
ID103        -18.7500    36.5676  -0.513   0.6085    
ID107         -3.3333    38.5457  -0.086   0.9311    
ID108         -5.0000    42.2247  -0.118   0.9058    
Gen2:ID84     11.7500    51.7144   0.227   0.8204    
Gen3:ID84    -46.0556    47.7303  -0.965   0.3354    
Gen4:ID84   -123.4744    47.0874  -2.622   0.0092 ** 
Gen5:ID84    -93.0875    45.5263  -2.045   0.0418 *  
Gen2:ID88     70.1000    47.8339   1.465   0.1439    
Gen3:ID88     34.1111    44.0292   0.775   0.4391    
Gen4:ID88      7.2637    43.3315   0.168   0.8670    
Gen5:ID88     81.4125    42.1366   1.932   0.0543 .  
Gen2:ID91      6.8333    50.2574   0.136   0.8919    
Gen3:ID91     -8.3333    46.6811  -0.179   0.8584    
Gen4:ID91    -48.5804    45.6306  -1.065   0.2879    
Gen5:ID91     -1.1875    45.1988  -0.026   0.9791    
Gen2:ID92          NA         NA      NA       NA    
Gen3:ID92          NA         NA      NA       NA    
Gen4:ID92     26.8798    61.7610   0.435   0.6637    
Gen5:ID92          NA         NA      NA       NA    
Gen2:ID94     38.6000    50.8452   0.759   0.4484    
Gen3:ID94      2.8111    48.3488   0.058   0.9537    
Gen4:ID94    -13.3077    47.0874  -0.283   0.7777    
Gen5:ID94    116.3125    45.7434   2.543   0.0115 *  
Gen2:ID96     59.8333    60.9461   0.982   0.3271    
Gen3:ID96     42.9111    56.8249   0.755   0.4508    
Gen4:ID96    -20.8077    55.7556  -0.373   0.7093    
Gen5:ID96     41.9554    55.1409   0.761   0.4474    
Gen2:ID99     52.5000    50.8452   1.033   0.3027    
Gen3:ID99      1.2361    46.9456   0.026   0.9790    
Gen4:ID99    -55.4327    46.2919  -1.197   0.2321    
Gen5:ID99    -35.9653    45.7434  -0.786   0.4324    
Gen2:ID103    59.0500    46.2548   1.277   0.2028    
Gen3:ID103    29.9722    41.6341   0.720   0.4722    
Gen4:ID103   -10.8077    40.2854  -0.268   0.7887    
Gen5:ID103   -23.5089    39.6986  -0.592   0.5542    
Gen2:ID107    38.8333    48.7568   0.796   0.4264    
Gen3:ID107    -0.7556    45.1714  -0.017   0.9867    
Gen4:ID107   -29.3744    44.4916  -0.660   0.5096    
Gen5:ID107   -40.8542    43.0338  -0.949   0.3432    
Gen2:ID108    51.6667    50.2574   1.028   0.3048    
Gen3:ID108    10.5556    46.6811   0.226   0.8213    
Gen4:ID108   -45.0769    45.3566  -0.994   0.3211    
Gen5:ID108    28.6125    44.8689   0.638   0.5242    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 42.22 on 288 degrees of freedom
Multiple R-squared:  0.4274,    Adjusted R-squared:  0.326 
F-statistic: 4.215 on 51 and 288 DF,  p-value: 3.016e-15
anova(lm(Siliques ~ Gen * ID, data = sil_data))
Analysis of Variance Table

Response: Siliques
           Df Sum Sq Mean Sq F value    Pr(>F)    
Gen         4 116793 29198.3 16.3766 4.222e-12 ***
ID         10  65216  6521.6  3.6578 0.0001295 ***
Gen:ID     37 201238  5438.9  3.0505 8.074e-08 ***
Residuals 288 513482  1782.9                      
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Oh, I see - aov still uses the default contrasts.

## Set orthogonal contrasts.
op <- options(contrasts = c("contr.helmert", "contr.poly"))
summary(aov(Siliques ~ Gen * ID, data = sil_data)) 
             Df Sum Sq Mean Sq F value   Pr(>F)    
Gen           4 116793   29198  16.377 4.22e-12 ***
ID           10  65216    6522   3.658  0.00013 ***
Gen:ID       37 201238    5439   3.051 8.07e-08 ***
Residuals   288 513482    1783                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
options(op)  # reset to previous

Nope, that didn’t do it either.

2 June 2017

Patterns of silique variation, continued

To get around the confusing results last time, let’s just create a new factor variable that is Gen:ID, so that each combo is treated independently.

sil_data <- subset(popLer_sil, Treatment == "B" & Generation > 0 & Gap == "0p")
sil_data$GenID <- with(sil_data, interaction(Gen, ID))
summary(aov(Siliques ~ Gen + GenID, data = sil_data)) 
             Df Sum Sq Mean Sq F value   Pr(>F)    
Gen           4 116793   29198   16.38 4.22e-12 ***
GenID        47 266454    5669    3.18 1.28e-09 ***
Residuals   288 513482    1783                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

OK, that’s clearer. After accounting for inter-generational variation, most of the remaining variation is among runways rather than within runways.

ggplot(aes(Generation, Siliques, group = ID, color = ID), data = sil_data) + geom_point() +
  geom_smooth(span = 1, se = FALSE)
`geom_smooth()` using method = 'loess'

Let’s look at how the residual variance scales with the mean.

sil_stats <- group_by(sil_data, Gen, ID) %>%
  summarise(Mean = mean(Siliques),
            Var = var(Siliques),
            SD = sqrt(Var))
sil_stats <- sil_stats[complete.cases(sil_stats), ]
ggplot(aes(Mean, Var, color = ID), data = sil_stats) + geom_point() 

ggplot(aes(Mean, SD, color = ID), data = sil_stats) + geom_point() 

ggplot(aes(Mean, Var), data = sil_stats) + geom_point() + geom_smooth(method = "lm")

ggplot(aes(Mean, SD), data = sil_stats) + geom_point() + geom_smooth(method = "lm")

The relationship between the SD and mean looks a bit more linear, although there’s still a lot of scatter.
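One way to make the scaling question quantitative (a sketch with simulated pots, not the silique counts): fit Taylor’s power law, regressing log(Var) on log(Mean). A slope near 1 is Poisson-like (Var ∝ Mean, the demographic-stochasticity expectation), while a slope near 2 means SD ∝ Mean, i.e. a roughly constant CV.

```r
## Sketch with simulated pot means: the slope of log(Var) on log(Mean)
## separates Poisson-like scaling (slope ~ 1) from constant-CV scaling
## (slope ~ 2). Means and sample sizes are made up.
set.seed(2)
mu <- runif(50, 20, 200)                                        # pot means
var_pois <- sapply(mu, function(m) var(rpois(20, m)))           # Var ~ Mean
var_cv   <- sapply(mu, function(m) var(rnorm(20, m, 0.3 * m)))  # SD ~ Mean
coef(lm(log(var_pois) ~ log(mu)))[2]  # slope near 1
coef(lm(log(var_cv)   ~ log(mu)))[2]  # slope near 2
```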

Now let’s see if the seed:silique ratio is structured in any way, or is independent among pots.

sil_data <- subset(popLer_sil, Treatment == "B" & Gap == "3p")
sil_data$Generation <- sil_data$Generation + 1
sil_data$Gen <- as.factor(sil_data$Generation)
popLer$ID <- as.factor(popLer$ID)
sil_seed_data <- left_join(sil_data, popLer)
Joining, by = c("Generation", "ID", "Treatment", "Gap", "Pot", "Gen")
Warning in left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y): joining
factors with different levels, coercing to character vector

Warning in left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y): joining
factors with different levels, coercing to character vector
Warning in left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y): joining
factor and character vector, coercing into character vector
ggplot(aes(Siliques, Seedlings, color = Gen), data = sil_seed_data) + 
  geom_point() + geom_smooth(method = "lm")
Warning: Removed 2 rows containing non-finite values (stat_smooth).
Warning: Removed 2 rows containing missing values (geom_point).

Within each generation, the number of seedlings in the home pot is independent of the number of siliques!!!

Now include the dispersing seeds in gen 1.

sil_data <- subset(popLer_sil, Treatment == "B" & Gap == "0p" & Gen == "0")
sil_data$Generation <- sil_data$Generation + 1
sil_data$Gen <- as.factor(sil_data$Generation)
Ler_seed_gen1$ID <- as.factor(Ler_seed_gen1$ID)
sil_seed_data <- left_join(sil_data, Ler_seed_gen1)
Joining, by = "ID"
Warning in left_join_impl(x, y, by$x, by$y, suffix$x, suffix$y): joining
factors with different levels, coercing to character vector
ggplot(aes(Siliques, eff_sd_no), data = sil_seed_data) + geom_point() + 
  geom_smooth(method = "lm")

ggplot(aes(Siliques, home), data = sil_seed_data) + geom_point() + 
  geom_smooth(method = "lm")

ggplot(aes(Siliques, stay_prop), data = sil_seed_data) + geom_point() + 
  geom_smooth(method = "lm")

So the lack of relationship between siliques and seedlings is not because of variability in dispersal.

Let’s look at population structure in the variability of home pot seedlings, using the 1p gaps to get replication within populations but seedling numbers mostly from local production:

seed_data <- subset(popLer, Treatment == "B" & Generation > 1 & Gap == "1p")
seed_data$GenID <- with(seed_data, interaction(Gen, ID))
summary(aov(Seedlings ~ Gen + GenID, data = seed_data)) 
            Df Sum Sq Mean Sq F value   Pr(>F)    
Gen          4  31628    7907   5.747 0.000388 ***
GenID       45 178349    3963   2.881 1.41e-05 ***
Residuals   84 115567    1376                     
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ggplot(aes(Gen, Seedlings, group = ID, color = ID), data = seed_data) + geom_point() +
  geom_smooth(span = 1, se = FALSE)
`geom_smooth()` using method = 'loess'

Let’s look at how the residual variance scales with the mean.

sil_stats <- group_by(seed_data, Gen, ID) %>%
  summarise(Mean = mean(Seedlings),
            Var = var(Seedlings),
            SD = sqrt(Var))
sil_stats <- sil_stats[complete.cases(sil_stats), ]
ggplot(aes(Mean, Var, color = ID), data = sil_stats) + geom_point() 

ggplot(aes(Mean, SD, color = ID), data = sil_stats) + geom_point() 

ggplot(aes(Mean, Var), data = sil_stats) + geom_point() + geom_smooth(method = "lm")

ggplot(aes(Mean, SD), data = sil_stats) + geom_point() + geom_smooth(method = "lm")

So, despite the lack of relationship between siliques and seedlings, we do see that seedling production is structured by runway, and that the SD scales roughly linearly with the mean (although not in a way that gives a constant CV).

We can also look at population-level structure in the density-dependent populations.

seed_data <- group_by(popLer, ID, Pot) %>%
  mutate(Nm1 = lag(Seedlings))
#seed_data$Nm1 <- lag(popLer$Seedlings)
seed_data <- subset(seed_data, Treatment == "C" & Generation > 1 & Gap == "1p")
seed_data$GenID <- with(seed_data, interaction(Gen, ID))
anova(lm(log(Seedlings/Nm1) ~ log(Nm1) + Gen + GenID, data = seed_data)) 
Analysis of Variance Table

Response: log(Seedlings/Nm1)
           Df Sum Sq Mean Sq  F value    Pr(>F)    
log(Nm1)    1 359.27  359.27 750.8986 < 2.2e-16 ***
Gen         4  38.89    9.72  20.3221 1.105e-12 ***
GenID      45  33.35    0.74   1.5489    0.0328 *  
Residuals 115  55.02    0.48                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
ggplot(aes(log(Nm1), log(Seedlings/Nm1), group = GenID, color = Gen), data = seed_data) +
  geom_point() + geom_smooth(method = "lm", se = FALSE)
Warning: Removed 36 rows containing non-finite values (stat_smooth).
Warning: Removed 36 rows containing missing values (geom_point).

So the population-level variation is still present.

5 June 2017

More on stochasticity in Ler seed production

Some observations on stochasticity in Ler seed production:

  1. Seed number is uncorrelated with silique number, so the latter is not a useful measure
  2. Variability in seed production is hierarchically structured:
    1. Among generations (reflecting, perhaps, season or other greenhouse-wide effects)
    2. Among runways within a generation (reflecting, perhaps, spatial variation in greenhouse conditions; this is the best we can do without a map of runway locations)
    3. Among pots within a runway (this remains to be parsed between differences among pots and differences among individuals)

Some things that I still need to do:

  1. Re-analyze treatment B using log(seedling number) as the response variable
  2. Re-run the DD model using random effects for generation and rep within generation
  3. Regress residuals on \(1/N\); the intercept should give the amount of the among-pot variation that is due to “environmental stochasticity.” However, maybe this should be residuals from a model of \(\lambda\) rather than \(\log \lambda\)?
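The logic of item 3, sketched with simulated numbers (all variances hypothetical): if pot-level growth estimates have variance \(\sigma^2_{env} + \sigma^2_{dem}/N\), then regressing squared residuals on \(1/N\) should recover the environmental component as the intercept and the demographic component as the slope.

```r
## Sketch (made-up variances s2e, s2d): Var(lambda_hat) = s2e + s2d / N,
## so the intercept of squared residuals regressed on 1/N estimates the
## environmental component and the slope the demographic one.
set.seed(3)
N   <- sample(5:100, 400, replace = TRUE)   # hypothetical pot sizes
s2e <- 0.05; s2d <- 2
lambda_hat <- 1.5 + rnorm(400, sd = sqrt(s2e + s2d / N))
resid2 <- (lambda_hat - mean(lambda_hat))^2
coef(lm(resid2 ~ I(1 / N)))  # intercept near s2e, slope near s2d
```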

Let’s start by just adding treatment B into the density dependence analysis. In order to get a good sample size we’ll use both landscapes 1p and 2p.

seed_data <- group_by(popLer, ID, Pot) %>%
  mutate(Nm1 = 1 + (Treatment == "C") * (lag(Seedlings) - 1))
#seed_data$Nm1 <- lag(popLer$Seedlings)
seed_data <- subset(seed_data, 
                    Treatment %in% c("B", "C") & Generation > 1 & Gap %in% c("1p", "2p"))
seed_data$GenID <- with(seed_data, interaction(Gen, ID))
seed_data$Nm1[is.na(seed_data$Nm1)] <- 1
seed_data$ID <- as.factor(seed_data$ID) 
DD.lm <- lm(log(Seedlings/Nm1) ~ log(Nm1) + Gen * ID, data = seed_data) 
anova(DD.lm)
Analysis of Variance Table

Response: log(Seedlings/Nm1)
           Df  Sum Sq Mean Sq  F value    Pr(>F)    
log(Nm1)    1 1197.23 1197.23 792.6968 < 2.2e-16 ***
Gen         4   85.65   21.41  14.1779 9.618e-11 ***
ID         39   56.53    1.45   0.9597    0.5426    
Gen:ID    156  196.33    1.26   0.8333    0.9043    
Residuals 354  534.66    1.51                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(DD.lm)

Call:
lm(formula = log(Seedlings/Nm1) ~ log(Nm1) + Gen * ID, data = seed_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.4725 -0.4679  0.0000  0.6050  2.6645 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.78599    0.87309   2.046   0.0415 *  
log(Nm1)    -0.43660    0.03108 -14.049   <2e-16 ***
Gen3         1.30529    1.22910   1.062   0.2890    
Gen4         1.20304    1.12203   1.072   0.2844    
Gen5         0.15684    1.12384   0.140   0.8891    
Gen6         0.65412    1.06459   0.614   0.5393    
ID2          0.89140    1.50799   0.591   0.5548    
ID3          2.43909    1.50644   1.619   0.1063    
ID8          0.82874    1.12189   0.739   0.4606    
ID10         1.07306    1.22904   0.873   0.3832    
ID12         1.55370    1.06446   1.460   0.1453    
ID14         0.70985    1.22899   0.578   0.5639    
ID16        -0.03284    1.22898  -0.027   0.9787    
ID20         1.17331    1.12207   1.046   0.2964    
ID24         2.58161    1.50694   1.713   0.0876 .  
ID30         1.49876    1.50685   0.995   0.3206    
ID37         0.94599    1.22899   0.770   0.4420    
ID43         0.91479    1.02872   0.889   0.3745    
ID45         1.42381    1.50777   0.944   0.3457    
ID58         0.81804    1.12246   0.729   0.4666    
ID64         0.37098    1.22907   0.302   0.7630    
ID65         1.12223    1.50696   0.745   0.4569    
ID68         0.77756    1.12198   0.693   0.4887    
ID69         0.78757    1.22896   0.641   0.5220    
ID72         1.00936    1.06444   0.948   0.3436    
ID73         1.50985    1.50752   1.002   0.3172    
ID74         2.16526    1.50752   1.436   0.1518    
ID75         2.02068    1.50752   1.340   0.1810    
ID78         3.28919    1.50752   2.182   0.0298 *  
ID80         3.00980    1.50752   1.997   0.0466 *  
ID83         0.12833    1.23185   0.104   0.9171    
ID85         0.63815    1.12505   0.567   0.5709    
ID87         0.61191    1.50752   0.406   0.6851    
ID89         2.78873    1.50752   1.850   0.0652 .  
ID90         1.55632    1.23185   1.263   0.2073    
ID92         0.85246    1.12505   0.758   0.4491    
ID97         0.73894    1.23185   0.600   0.5490    
ID98         1.46316    1.23185   1.188   0.2357    
ID100       -0.03773    1.23185  -0.031   0.9756    
ID101       -0.90630    1.12505  -0.806   0.4210    
ID104        2.90279    1.23185   2.356   0.0190 *  
ID105        2.88684    1.50752   1.915   0.0563 .  
ID106        0.08285    1.23185   0.067   0.9464    
ID110        1.10439    1.50752   0.733   0.4643    
ID112        0.46555    1.12505   0.414   0.6793    
Gen3:ID2    -2.16031    1.94593  -1.110   0.2677    
Gen4:ID2    -1.65507    1.81315  -0.913   0.3620    
Gen5:ID2    -0.43102    1.78112  -0.242   0.8089    
Gen6:ID2    -1.32984    1.73986  -0.764   0.4452    
Gen3:ID3    -3.38632    1.94402  -1.742   0.0824 .  
Gen4:ID3    -1.98500    1.81076  -1.096   0.2737    
Gen5:ID3    -1.38528    1.77649  -0.780   0.4360    
Gen6:ID3    -2.98502    1.73805  -1.717   0.0868 .  
Gen3:ID8    -1.33328    1.58682  -0.840   0.4013    
Gen4:ID8    -1.11609    1.50610  -0.741   0.4592    
Gen5:ID8     0.74749    1.50533   0.497   0.6198    
Gen6:ID8    -2.72659    1.46619  -1.860   0.0638 .  
Gen3:ID10   -1.07110    1.73845  -0.616   0.5382    
Gen4:ID10   -1.78773    1.58675  -1.127   0.2606    
Gen5:ID10    1.28495    1.58657   0.810   0.4185    
Gen6:ID10   -1.29988    1.55062  -0.838   0.4024    
Gen3:ID12   -1.21135    1.50620  -0.804   0.4218    
Gen4:ID12   -1.90996    1.39370  -1.370   0.1714    
Gen5:ID12   -0.50411    1.37402  -0.367   0.7139    
Gen6:ID12   -2.52657    1.32983  -1.900   0.0583 .  
Gen3:ID14   -0.70204    1.73810  -0.404   0.6865    
Gen4:ID14   -0.07674    1.54663  -0.050   0.9605    
Gen5:ID14    0.50740    1.54640   0.328   0.7430    
Gen6:ID14   -1.09967    1.48022  -0.743   0.4580    
Gen3:ID16    0.74469    1.73801   0.428   0.6686    
Gen4:ID16   -0.28106    1.54655  -0.182   0.8559    
Gen5:ID16    1.21610    1.54660   0.786   0.4322    
Gen6:ID16    0.22965    1.50619   0.152   0.8789    
Gen3:ID20   -0.97470    1.54646  -0.630   0.5289    
Gen4:ID20   -1.13200    1.46463  -0.773   0.4401    
Gen5:ID20    0.36067    1.41910   0.254   0.7995    
Gen6:ID20   -1.78062    1.36232  -1.307   0.1920    
Gen3:ID24   -2.53815    2.12916  -1.192   0.2340    
Gen4:ID24   -3.53938    2.06919  -1.711   0.0880 .  
Gen5:ID24   -3.06753    1.88171  -1.630   0.1040    
Gen6:ID24   -1.83600    1.84533  -0.995   0.3204    
Gen3:ID30   -1.77556    2.12867  -0.834   0.4048    
Gen4:ID30   -3.66532    1.87884  -1.951   0.0519 .  
Gen5:ID30    0.13060    1.88262   0.069   0.9447    
Gen6:ID30   -0.96734    1.84359  -0.525   0.6001    
Gen3:ID37   -2.06866    1.73832  -1.190   0.2348    
Gen4:ID37   -0.97963    1.58658  -0.617   0.5373    
Gen5:ID37    0.30129    1.54681   0.195   0.8457    
Gen6:ID37   -1.37315    1.48010  -0.928   0.3542    
Gen3:ID43   -0.44298    1.45474  -0.305   0.7609    
Gen4:ID43   -0.70395    1.36878  -0.514   0.6074    
Gen5:ID43    0.30094    1.36712   0.220   0.8259    
Gen6:ID43   -2.30305    1.32457  -1.739   0.0830 .  
Gen3:ID45   -2.05370    1.88148  -1.092   0.2758    
Gen4:ID45   -0.45029    1.81065  -0.249   0.8037    
Gen5:ID45   -1.26464    1.77661  -0.712   0.4770    
Gen6:ID45   -2.32672    1.71811  -1.354   0.1765    
Gen3:ID58    0.12857    1.58711   0.081   0.9355    
Gen4:ID58   -0.88546    1.50992  -0.586   0.5580    
Gen5:ID58    0.49525    1.50735   0.329   0.7427    
Gen6:ID58   -1.54672    1.46931  -1.053   0.2932    
Gen3:ID64   -0.63984    1.66409  -0.385   0.7008    
Gen4:ID64   -0.58730    1.58711  -0.370   0.7116    
Gen5:ID64    1.15752    1.52230   0.760   0.4475    
Gen6:ID64   -0.78092    1.48146  -0.527   0.5984    
Gen3:ID65   -0.57220    2.12861  -0.269   0.7882    
Gen4:ID65   -1.71717    2.06898  -0.830   0.4071    
Gen5:ID65   -0.15346    2.06885  -0.074   0.9409    
Gen6:ID65   -1.07533    2.03807  -0.528   0.5981    
Gen3:ID68   -1.29369    1.58687  -0.815   0.4155    
Gen4:ID68   -0.91634    1.50632  -0.608   0.5434    
Gen5:ID68    0.01161    1.46276   0.008   0.9937    
Gen6:ID68   -1.04929    1.42049  -0.739   0.4606    
Gen3:ID69   -0.42327    1.66409  -0.254   0.7994    
Gen4:ID69   -1.54499    1.58753  -0.973   0.3311    
Gen5:ID69    1.04075    1.54669   0.673   0.5015    
Gen6:ID69   -1.03891    1.48042  -0.702   0.4833    
Gen3:ID72   -1.00478    1.46275  -0.687   0.4926    
Gen4:ID72   -1.22906    1.37521  -0.894   0.3721    
Gen5:ID72   -0.46668    1.36089  -0.343   0.7319    
Gen6:ID72   -2.07704    1.31545  -1.579   0.1152    
Gen3:ID73   -0.64989    2.12869  -0.305   0.7603    
Gen4:ID73   -1.60851    2.06872  -0.778   0.4374    
Gen5:ID73    2.13831    2.06970   1.033   0.3022    
Gen6:ID73   -1.03255    1.84359  -0.560   0.5758    
Gen3:ID74   -1.88924    2.12869  -0.888   0.3754    
Gen4:ID74   -1.28657    1.87735  -0.685   0.4936    
Gen5:ID74    0.39603    1.87843   0.211   0.8331    
Gen6:ID74   -1.88424    1.77401  -1.062   0.2889    
Gen3:ID75   -0.75525    2.12869  -0.355   0.7230    
Gen4:ID75   -0.67897    2.06872  -0.328   0.7429    
Gen5:ID75   -0.96777    2.06970  -0.468   0.6404    
Gen6:ID75   -2.32245    1.84359  -1.260   0.2086    
Gen3:ID78   -3.54725    2.12869  -1.666   0.0965 .  
Gen4:ID78   -3.88032    2.06872  -1.876   0.0615 .  
Gen5:ID78   -3.18899    1.81019  -1.762   0.0790 .  
Gen6:ID78   -2.15341    1.77401  -1.214   0.2256    
Gen3:ID80   -3.85118    1.94324  -1.982   0.0483 *  
Gen4:ID80   -1.92144    1.87735  -1.023   0.3068    
Gen5:ID80   -1.06104    1.81019  -0.586   0.5581    
Gen6:ID80   -1.60978    1.77401  -0.907   0.3648    
Gen3:ID83    0.66062    1.73810   0.380   0.7041    
Gen4:ID83    1.41472    1.66411   0.850   0.3958    
Gen5:ID83    1.00308    1.66533   0.602   0.5473    
Gen6:ID83   -0.16410    1.54659  -0.106   0.9156    
Gen3:ID85   -0.03199    1.58668  -0.020   0.9839    
Gen4:ID85    0.56258    1.50527   0.374   0.7088    
Gen5:ID85    0.33365    1.50662   0.221   0.8249    
Gen6:ID85   -0.01573    1.46295  -0.011   0.9914    
Gen3:ID87    0.01038    2.12869   0.005   0.9961    
Gen4:ID87   -0.65650    2.06872  -0.317   0.7512    
Gen5:ID87   -0.53321    1.87843  -0.284   0.7767    
Gen6:ID87   -1.26026    1.84359  -0.684   0.4947    
Gen3:ID89   -2.02986    2.12869  -0.954   0.3410    
Gen4:ID89   -1.92760    2.06872  -0.932   0.3521    
Gen5:ID89   -1.78711    2.06970  -0.863   0.3885    
Gen6:ID89   -2.18431    2.03814  -1.072   0.2846    
Gen3:ID90   -1.08191    1.66412  -0.650   0.5160    
Gen4:ID90   -1.99026    1.54651  -1.287   0.1990    
Gen5:ID90    0.58267    1.52323   0.383   0.7023    
Gen6:ID90   -1.52391    1.46295  -1.042   0.2983    
Gen3:ID92   -0.59698    1.58668  -0.376   0.7070    
Gen4:ID92   -0.41567    1.50527  -0.276   0.7826    
Gen5:ID92    2.20921    1.46425   1.509   0.1323    
Gen6:ID92   -0.78100    1.39243  -0.561   0.5752    
Gen3:ID97   -0.52622    1.73810  -0.303   0.7623    
Gen4:ID97    0.43335    1.66411   0.260   0.7947    
Gen5:ID97    0.94320    1.66533   0.566   0.5715    
Gen6:ID97    0.27881    1.62594   0.171   0.8639    
Gen3:ID98   -0.90303    1.66412  -0.543   0.5877    
Gen4:ID98   -1.95525    1.54651  -1.264   0.2070    
Gen5:ID98   -1.10618    1.54782  -0.715   0.4753    
Gen6:ID98   -1.95746    1.50536  -1.300   0.1943    
Gen3:ID100   0.84551    1.73810   0.486   0.6269    
Gen4:ID100   0.64140    1.66411   0.385   0.7002    
Gen5:ID100   1.58727    1.66533   0.953   0.3412    
Gen6:ID100   0.69073    1.62594   0.425   0.6712    
Gen3:ID101   1.86031    1.58668   1.172   0.2418    
Gen4:ID101   1.93286    1.50527   1.284   0.2000    
Gen5:ID101   2.29557    1.50662   1.524   0.1285    
Gen6:ID101   1.82846    1.46295   1.250   0.2122    
Gen3:ID104  -1.85765    1.73810  -1.069   0.2859    
Gen4:ID104  -3.17278    1.66411  -1.907   0.0574 .  
Gen5:ID104  -1.99215    1.54782  -1.287   0.1989    
Gen6:ID104  -2.98978    1.50536  -1.986   0.0478 *  
Gen3:ID105  -1.02229    2.12869  -0.480   0.6313    
Gen4:ID105  -2.09168    2.06872  -1.011   0.3127    
Gen5:ID105  -0.34103    2.06970  -0.165   0.8692    
Gen6:ID105  -0.98315    2.03814  -0.482   0.6298    
Gen3:ID106  -0.41737    1.66412  -0.251   0.8021    
Gen4:ID106   0.54158    1.58668   0.341   0.7331    
Gen5:ID106   1.44681    1.54782   0.935   0.3506    
Gen6:ID106  -0.34875    1.50536  -0.232   0.8169    
Gen3:ID110  -0.66930    2.12869  -0.314   0.7534    
Gen4:ID110  -0.61371    1.87735  -0.327   0.7439    
Gen5:ID110   0.68361    1.87843   0.364   0.7161    
Gen6:ID110  -0.87094    1.84359  -0.472   0.6369    
Gen3:ID112  -0.09026    1.58668  -0.057   0.9547    
Gen4:ID112  -0.06853    1.50527  -0.046   0.9637    
Gen5:ID112   1.11058    1.46425   0.758   0.4487    
Gen6:ID112  -0.39053    1.41929  -0.275   0.7834    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.229 on 354 degrees of freedom
Multiple R-squared:  0.7418,    Adjusted R-squared:  0.5959 
F-statistic: 5.084 on 200 and 354 DF,  p-value: < 2.2e-16
ggplot(aes(log(Nm1), log(Seedlings/Nm1), group = GenID, color = Gen), data = seed_data) +
  geom_point() + geom_smooth(method = "lm", se = FALSE)

All right, now let’s try with glm and quasi-poisson:

DD.glm <- glm(Seedlings ~ log(Nm1) + Gen * ID, data = seed_data, family = quasipoisson) 
car::Anova(DD.glm) 
Analysis of Deviance Table (Type II tests)

Response: Seedlings
         LR Chisq  Df Pr(>Chisq)    
log(Nm1)   431.29   1  < 2.2e-16 ***
Gen        192.69   4  < 2.2e-16 ***
ID          78.50  39  0.0001811 ***
Gen:ID     223.80 156  0.0003043 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(DD.glm)

Call:
glm(formula = Seedlings ~ log(Nm1) + Gen * ID, family = quasipoisson, 
    data = seed_data)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-25.346   -5.058    0.000    2.938   31.110  

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.043179   0.451232   8.960  < 2e-16 ***
log(Nm1)     0.320337   0.017984  17.812  < 2e-16 ***
Gen3         0.508582   0.544455   0.934 0.350883    
Gen4         0.553890   0.509055   1.088 0.277302    
Gen5        -0.190163   0.539657  -0.352 0.724765    
Gen6        -0.182980   0.540954  -0.338 0.735373    
ID2          0.017309   0.637472   0.027 0.978353    
ID3          1.328772   0.523153   2.540 0.011514 *  
ID8          0.247817   0.550162   0.450 0.652666    
ID10         0.114426   0.648556   0.176 0.860056    
ID12         0.318970   0.538649   0.592 0.554118    
ID14        -0.098951   0.618344  -0.160 0.872953    
ID16         0.072229   0.597492   0.121 0.903850    
ID20        -0.183721   0.628544  -0.292 0.770231    
ID24         1.557990   0.501370   3.107 0.002040 ** 
ID30         0.461067   0.602790   0.765 0.444848    
ID37         0.350891   0.598379   0.586 0.557979    
ID43         0.396803   0.528499   0.751 0.453265    
ID45         0.520991   0.573021   0.909 0.363863    
ID58         0.065428   0.623718   0.105 0.916514    
ID64         0.419852   0.607499   0.691 0.489946    
ID65         0.101626   0.655837   0.155 0.876944    
ID68        -0.108707   0.616558  -0.176 0.860150    
ID69        -0.160725   0.664430  -0.242 0.808998    
ID72         0.275853   0.537597   0.513 0.608186    
ID73        -0.747343   1.728171  -0.432 0.665680    
ID74        -0.091936   1.283982  -0.072 0.942959    
ID75        -0.236517   1.368718  -0.173 0.862906    
ID78         1.031994   0.820509   1.258 0.209312    
ID80         0.752611   0.908076   0.829 0.407778    
ID83        -1.517451   1.791424  -0.847 0.397532    
ID85        -0.251443   0.876600  -0.287 0.774404    
ID87        -1.645284   2.652267  -0.620 0.535439    
ID89         0.531531   0.989065   0.537 0.591323    
ID90         0.369619   0.811782   0.455 0.649160    
ID92        -0.379618   0.919689  -0.413 0.680027    
ID97        -0.728993   1.252908  -0.582 0.561044    
ID98        -0.225467   1.014555  -0.222 0.824261    
ID100       -2.097269   2.360245  -0.889 0.374832    
ID101       -2.369203   2.213498  -1.070 0.285194    
ID104        0.812749   0.704261   1.154 0.249260    
ID105        0.629649   0.951763   0.662 0.508683    
ID106       -0.975127   1.396775  -0.698 0.485556    
ID110       -1.152808   2.092381  -0.551 0.582012    
ID112       -1.014657   1.189767  -0.853 0.394335    
Gen3:ID2    -0.725818   0.847518  -0.856 0.392353    
Gen4:ID2     0.188788   0.733663   0.257 0.797080    
Gen5:ID2     0.048529   0.777082   0.062 0.950240    
Gen6:ID2    -0.418015   0.800906  -0.522 0.602047    
Gen3:ID3    -1.626290   0.692556  -2.348 0.019412 *  
Gen4:ID3    -1.126329   0.626939  -1.797 0.073259 .  
Gen5:ID3    -0.842500   0.648192  -1.300 0.194526    
Gen6:ID3    -1.350227   0.662340  -2.039 0.042236 *  
Gen3:ID8    -0.366896   0.682695  -0.537 0.591314    
Gen4:ID8    -0.934527   0.673860  -1.387 0.166367    
Gen5:ID8     0.763515   0.655545   1.165 0.244923    
Gen6:ID8    -1.804249   0.837205  -2.155 0.031829 *  
Gen3:ID10   -0.419608   0.803372  -0.522 0.601782    
Gen4:ID10   -0.094649   0.735702  -0.129 0.897706    
Gen5:ID10    1.060966   0.737561   1.438 0.151182    
Gen6:ID10    0.020512   0.755992   0.027 0.978369    
Gen3:ID12   -0.378844   0.658266  -0.576 0.565307    
Gen4:ID12   -0.958024   0.635016  -1.509 0.132277    
Gen5:ID12    0.357942   0.641027   0.558 0.576933    
Gen6:ID12   -1.262101   0.701813  -1.798 0.072974 .  
Gen3:ID14    0.121812   0.754102   0.162 0.871766    
Gen4:ID14   -0.512795   0.736078  -0.697 0.486474    
Gen5:ID14    0.780078   0.717626   1.087 0.277765    
Gen6:ID14   -0.045252   0.735141  -0.062 0.950952    
Gen3:ID16   -0.061494   0.741496  -0.083 0.933952    
Gen4:ID16   -1.202151   0.759309  -1.583 0.114265    
Gen5:ID16    0.342530   0.717455   0.477 0.633355    
Gen6:ID16    0.100265   0.712884   0.141 0.888229    
Gen3:ID20   -0.151832   0.760273  -0.200 0.841824    
Gen4:ID20    0.257355   0.700666   0.367 0.713614    
Gen5:ID20    1.222094   0.712657   1.715 0.087249 .  
Gen6:ID20    0.079012   0.729305   0.108 0.913788    
Gen3:ID24   -1.218430   0.638428  -1.908 0.057137 .  
Gen4:ID24   -2.372220   0.695935  -3.409 0.000728 ***
Gen5:ID24   -1.147753   0.690829  -1.661 0.097516 .  
Gen6:ID24   -1.393997   0.709023  -1.966 0.050070 .  
Gen3:ID30   -0.712891   0.779117  -0.915 0.360815    
Gen4:ID30   -2.012935   0.922025  -2.183 0.029680 *  
Gen5:ID30    0.041165   0.783026   0.053 0.958103    
Gen6:ID30   -0.181481   0.756519  -0.240 0.810554    
Gen3:ID37   -1.148642   0.807422  -1.423 0.155732    
Gen4:ID37   -0.211171   0.697251  -0.303 0.762172    
Gen5:ID37    0.222909   0.709411   0.314 0.753541    
Gen6:ID37   -0.468397   0.726377  -0.645 0.519449    
Gen3:ID43   -0.394548   0.645279  -0.611 0.541301    
Gen4:ID43   -0.235858   0.602237  -0.392 0.695563    
Gen5:ID43    0.460569   0.625507   0.736 0.462028    
Gen6:ID43   -1.422365   0.692022  -2.055 0.040576 *  
Gen3:ID45   -1.286955   0.777725  -1.655 0.098857 .  
Gen4:ID45   -0.415576   0.672081  -0.618 0.536748    
Gen5:ID45   -0.282137   0.693140  -0.407 0.684223    
Gen6:ID45   -1.031474   0.734838  -1.404 0.161292    
Gen3:ID58    0.312379   0.732627   0.426 0.670088    
Gen4:ID58   -0.298223   0.705544  -0.423 0.672782    
Gen5:ID58    0.897457   0.714474   1.256 0.209905    
Gen6:ID58   -0.269180   0.748447  -0.360 0.719322    
Gen3:ID64   -0.025388   0.725404  -0.035 0.972101    
Gen4:ID64   -1.448796   0.752154  -1.926 0.054880 .  
Gen5:ID64   -0.005864   0.723116  -0.008 0.993534    
Gen6:ID64   -0.960080   0.750582  -1.279 0.201695    
Gen3:ID65    0.391479   0.778362   0.503 0.615310    
Gen4:ID65   -0.628793   0.796096  -0.790 0.430147    
Gen5:ID65    0.483003   0.801865   0.602 0.547327    
Gen6:ID65    0.092653   0.831419   0.111 0.911331    
Gen3:ID68   -0.650878   0.787310  -0.827 0.408957    
Gen4:ID68   -0.517148   0.735852  -0.703 0.482650    
Gen5:ID68    0.511472   0.728950   0.702 0.483355    
Gen6:ID68   -0.089564   0.748860  -0.120 0.904867    
Gen3:ID69    0.678626   0.769246   0.882 0.378269    
Gen4:ID69   -1.151739   0.812405  -1.418 0.157161    
Gen5:ID69    0.913132   0.763121   1.197 0.232273    
Gen6:ID69    0.041416   0.775749   0.053 0.957452    
Gen3:ID72   -0.107015   0.650276  -0.165 0.869378    
Gen4:ID72   -1.128138   0.636728  -1.772 0.077292 .  
Gen5:ID72    0.059203   0.644784   0.092 0.926895    
Gen6:ID72   -1.197553   0.694305  -1.725 0.085432 .  
Gen3:ID73    0.146825   2.127063   0.069 0.945007    
Gen4:ID73   -0.959355   2.686363  -0.357 0.721214    
Gen5:ID73    2.485313   1.831548   1.357 0.175663    
Gen6:ID73    1.347287   1.873472   0.719 0.472530    
Gen3:ID74   -1.092530   2.081458  -0.525 0.599990    
Gen4:ID74   -0.462082   1.537290  -0.301 0.763909    
Gen5:ID74    0.744269   1.466755   0.507 0.612172    
Gen6:ID74   -0.412529   1.615968  -0.255 0.798653    
Gen3:ID75    0.041464   1.711591   0.024 0.980686    
Gen4:ID75   -0.029819   1.708098  -0.017 0.986081    
Gen5:ID75   -0.620767   2.391236  -0.260 0.795322    
Gen6:ID75   -1.225788   2.319023  -0.529 0.597428    
Gen3:ID78   -2.750542   2.277292  -1.208 0.227926    
Gen4:ID78   -3.231168   2.749486  -1.175 0.240709    
Gen5:ID78   -1.872749   1.411594  -1.327 0.185466    
Gen6:ID78   -1.308675   1.207476  -1.084 0.279186    
Gen3:ID80   -2.147372   1.586228  -1.354 0.176677    
Gen4:ID80   -1.272143   1.231628  -1.033 0.302358    
Gen5:ID80    0.145085   1.062433   0.137 0.891457    
Gen6:ID80   -0.720991   1.193641  -0.604 0.546213    
Gen3:ID83    0.847253   2.019075   0.420 0.675014    
Gen4:ID83    1.695294   1.892558   0.896 0.370985    
Gen5:ID83    0.755477   2.237085   0.338 0.735785    
Gen6:ID83    0.977479   2.050248   0.477 0.633826    
Gen3:ID85   -0.539119   1.201553  -0.449 0.653934    
Gen4:ID85   -0.111520   1.089472  -0.102 0.918527    
Gen5:ID85   -0.330739   1.344317  -0.246 0.805804    
Gen6:ID85   -0.532982   1.418961  -0.376 0.707428    
Gen3:ID87    0.807095   2.993330   0.270 0.787601    
Gen4:ID87   -0.007346   3.323370  -0.002 0.998238    
Gen5:ID87    1.159563   2.901308   0.400 0.689641    
Gen6:ID87   -0.343113   3.592151  -0.096 0.923958    
Gen3:ID89   -1.233145   1.633951  -0.755 0.450930    
Gen4:ID89   -1.278453   1.622498  -0.788 0.431251    
Gen5:ID89   -1.440109   2.240671  -0.643 0.520826    
Gen6:ID89   -1.347209   2.155311  -0.625 0.532331    
Gen3:ID90   -1.232501   1.173873  -1.050 0.294460    
Gen4:ID90   -2.327631   1.433981  -1.623 0.105438    
Gen5:ID90   -0.063752   0.990668  -0.064 0.948725    
Gen6:ID90   -1.339447   1.201562  -1.115 0.265712    
Gen3:ID92   -0.782119   1.335260  -0.586 0.558421    
Gen4:ID92   -0.710894   1.285595  -0.553 0.580634    
Gen5:ID92    1.566795   1.027148   1.525 0.128056    
Gen6:ID92   -0.261706   1.239341  -0.211 0.832880    
Gen3:ID97   -0.267420   1.654094  -0.162 0.871657    
Gen4:ID97    0.380419   1.470381   0.259 0.796000    
Gen5:ID97    0.637177   1.590949   0.401 0.689029    
Gen6:ID97    0.365301   1.672508   0.218 0.827232    
Gen3:ID98   -0.662733   1.328294  -0.499 0.618135    
Gen4:ID98   -1.269260   1.388927  -0.914 0.361422    
Gen5:ID98   -0.737178   1.469882  -0.502 0.616316    
Gen6:ID98   -1.129207   1.628438  -0.693 0.488495    
Gen3:ID100   1.457531   2.532780   0.575 0.565341    
Gen4:ID100   1.111118   2.577147   0.431 0.666627    
Gen5:ID100   2.321790   2.509014   0.925 0.355399    
Gen6:ID100   2.009830   2.554105   0.787 0.431866    
Gen3:ID101   1.900613   2.326914   0.817 0.414595    
Gen4:ID101   1.815185   2.322609   0.782 0.435014    
Gen5:ID101   1.883482   2.418857   0.779 0.436697    
Gen6:ID101   2.146589   2.376492   0.903 0.367002    
Gen3:ID104  -1.167309   1.074136  -1.087 0.277890    
Gen4:ID104  -2.606458   1.681809  -1.550 0.122084    
Gen5:ID104  -1.139406   1.065882  -1.069 0.285807    
Gen6:ID104  -2.033892   1.388089  -1.465 0.143741    
Gen3:ID105  -0.225584   1.236054  -0.183 0.855292    
Gen4:ID105  -1.442529   1.633741  -0.883 0.377857    
Gen5:ID105   0.005970   1.355634   0.004 0.996489    
Gen6:ID105  -0.146044   1.403823  -0.104 0.917202    
Gen3:ID106  -0.220900   1.708075  -0.129 0.897173    
Gen4:ID106   0.006833   1.634461   0.004 0.996667    
Gen5:ID106   1.619138   1.499301   1.080 0.280909    
Gen6:ID106  -0.144233   1.803312  -0.080 0.936296    
Gen3:ID110   0.127407   2.584739   0.049 0.960714    
Gen4:ID110   0.405886   2.287563   0.177 0.859271    
Gen5:ID110   1.083981   2.306413   0.470 0.638655    
Gen6:ID110   0.313033   2.509932   0.125 0.900818    
Gen3:ID112   0.117874   1.468375   0.080 0.936064    
Gen4:ID112  -0.027064   1.478746  -0.018 0.985408    
Gen5:ID112   1.032842   1.376415   0.750 0.453521    
Gen6:ID112   0.267973   1.529522   0.175 0.861022    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 75.14003)

    Null deviance: 163903  on 554  degrees of freedom
Residual deviance:  27271  on 354  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 5

Let’s look at the residuals:

seed_data$Fitted <- fitted(DD.glm)
seed_data <- mutate(seed_data,
                           resid2 = ((Seedlings/Nm1) - (Fitted/Nm1))^2)
ggplot(aes(1/Nm1, resid2), data = seed_data) + geom_point() + scale_y_log10() + 
  geom_smooth()
`geom_smooth()` using method = 'loess'

summary(lm(resid2 ~ I(1/Nm1), data = seed_data))

Call:
lm(formula = resid2 ~ I(1/Nm1), data = seed_data)

Residuals:
   Min     1Q Median     3Q    Max 
 -2824  -2546    -56      5 272077 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)   -11.63     777.21  -0.015   0.9881   
I(1/Nm1)     2835.73    1063.60   2.666   0.0079 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 12130 on 553 degrees of freedom
Multiple R-squared:  0.01269,   Adjusted R-squared:  0.01091 
F-statistic: 7.108 on 1 and 553 DF,  p-value: 0.007896

The regression intercept is indistinguishable from zero, suggesting that none of the among-pot variation is due to “environmental stochasticity.” However, the log plot reveals a bunch of cases with a residual of exactly zero; inspection of the dataset reveals that these are cases with only a single pot (i.e., singleton levels of GenID). I need to figure out how to drop those from the dataset before the analysis!

GenID_counts <- table(seed_data$GenID)
singletons <- rownames(GenID_counts)[GenID_counts == 1]
seed_data <- droplevels(seed_data[-match(singletons, seed_data$GenID), ])
DD.lm <- lm(log(Seedlings/Nm1) ~ log(Nm1) + Gen * ID, data = seed_data) 
anova(DD.lm)
Analysis of Variance Table

Response: log(Seedlings/Nm1)
           Df  Sum Sq Mean Sq  F value    Pr(>F)    
log(Nm1)    1 1072.27 1072.27 709.9570 < 2.2e-16 ***
Gen         4   88.87   22.22  14.7102  3.97e-11 ***
ID         36   42.11    1.17   0.7745    0.8234    
Gen:ID    117  163.81    1.40   0.9270    0.6820    
Residuals 354  534.66    1.51                       
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(DD.lm)

Call:
lm(formula = log(Seedlings/Nm1) ~ log(Nm1) + Gen * ID, data = seed_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.4725 -0.5346  0.0608  0.6502  2.6645 

Coefficients: (27 not defined because of singularities)
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.78599    0.87309   2.046   0.0415 *  
log(Nm1)    -0.43660    0.03108 -14.049   <2e-16 ***
Gen3         1.30529    1.22910   1.062   0.2890    
Gen4         1.20304    1.12203   1.072   0.2844    
Gen5         0.15684    1.12384   0.140   0.8891    
Gen6         0.65412    1.06459   0.614   0.5393    
ID2         -0.43844    0.86909  -0.504   0.6142    
ID3         -0.54593    0.87041  -0.627   0.5309    
ID8          0.82874    1.12189   0.739   0.4606    
ID10         1.07306    1.22904   0.873   0.3832    
ID12         1.55370    1.06446   1.460   0.1453    
ID14         0.70985    1.22899   0.578   0.5639    
ID16        -0.03284    1.22898  -0.027   0.9787    
ID20         1.17331    1.12207   1.046   0.2964    
ID24         0.74561    1.06436   0.701   0.4841    
ID30         0.53141    1.06535   0.499   0.6182    
ID37         0.94599    1.22899   0.770   0.4420    
ID43         0.91479    1.02872   0.889   0.3745    
ID45        -0.90291    0.82443  -1.095   0.2742    
ID58         0.81804    1.12246   0.729   0.4666    
ID64         0.37098    1.22907   0.302   0.7630    
ID68         0.77756    1.12198   0.693   0.4887    
ID69         0.78757    1.22896   0.641   0.5220    
ID72         1.00936    1.06444   0.948   0.3436    
ID73         0.47730    1.06987   0.446   0.6558    
ID74         0.28102    0.94494   0.297   0.7663    
ID75        -0.30177    1.06987  -0.282   0.7781    
ID78         1.13578    0.94494   1.202   0.2302    
ID80         1.40002    0.94494   1.482   0.1393    
ID83         0.12833    1.23185   0.104   0.9171    
ID85         0.63815    1.12505   0.567   0.5709    
ID87        -0.64835    1.06987  -0.606   0.5449    
ID90         1.55632    1.23185   1.263   0.2073    
ID92         0.85246    1.12505   0.758   0.4491    
ID97         0.73894    1.23185   0.600   0.5490    
ID98         1.46316    1.23185   1.188   0.2357    
ID100       -0.03773    1.23185  -0.031   0.9756    
ID101       -0.90630    1.12505  -0.806   0.4210    
ID104        2.90279    1.23185   2.356   0.0190 *  
ID106        0.08285    1.23185   0.067   0.9464    
ID110        0.23345    1.06987   0.218   0.8274    
ID112        0.46555    1.12505   0.414   0.6793    
Gen3:ID2    -0.83047    1.50534  -0.552   0.5815    
Gen4:ID2    -0.32523    1.32811  -0.245   0.8067    
Gen5:ID2     0.89882    1.28167   0.701   0.4836    
Gen6:ID2          NA         NA      NA       NA    
Gen3:ID3    -0.40130    1.50585  -0.266   0.7900    
Gen4:ID3     1.00002    1.32914   0.752   0.4523    
Gen5:ID3     1.59974    1.28191   1.248   0.2129    
Gen6:ID3          NA         NA      NA       NA    
Gen3:ID8    -1.33328    1.58682  -0.840   0.4013    
Gen4:ID8    -1.11609    1.50610  -0.741   0.4592    
Gen5:ID8     0.74749    1.50533   0.497   0.6198    
Gen6:ID8    -2.72659    1.46619  -1.860   0.0638 .  
Gen3:ID10   -1.07110    1.73845  -0.616   0.5382    
Gen4:ID10   -1.78773    1.58675  -1.127   0.2606    
Gen5:ID10    1.28495    1.58657   0.810   0.4185    
Gen6:ID10   -1.29988    1.55062  -0.838   0.4024    
Gen3:ID12   -1.21135    1.50620  -0.804   0.4218    
Gen4:ID12   -1.90996    1.39370  -1.370   0.1714    
Gen5:ID12   -0.50411    1.37402  -0.367   0.7139    
Gen6:ID12   -2.52657    1.32983  -1.900   0.0583 .  
Gen3:ID14   -0.70204    1.73810  -0.404   0.6865    
Gen4:ID14   -0.07674    1.54663  -0.050   0.9605    
Gen5:ID14    0.50740    1.54640   0.328   0.7430    
Gen6:ID14   -1.09967    1.48022  -0.743   0.4580    
Gen3:ID16    0.74469    1.73801   0.428   0.6686    
Gen4:ID16   -0.28106    1.54655  -0.182   0.8559    
Gen5:ID16    1.21610    1.54660   0.786   0.4322    
Gen6:ID16    0.22965    1.50619   0.152   0.8789    
Gen3:ID20   -0.97470    1.54646  -0.630   0.5289    
Gen4:ID20   -1.13200    1.46463  -0.773   0.4401    
Gen5:ID20    0.36067    1.41910   0.254   0.7995    
Gen6:ID20   -1.78062    1.36232  -1.307   0.1920    
Gen3:ID24         NA         NA      NA       NA    
Gen4:ID24         NA         NA      NA       NA    
Gen5:ID24   -1.23153    1.54708  -0.796   0.4265    
Gen6:ID24         NA         NA      NA       NA    
Gen3:ID30         NA         NA      NA       NA    
Gen4:ID30   -2.69798    1.54730  -1.744   0.0821 .  
Gen5:ID30    1.09795    1.55087   0.708   0.4794    
Gen6:ID30         NA         NA      NA       NA    
Gen3:ID37   -2.06866    1.73832  -1.190   0.2348    
Gen4:ID37   -0.97963    1.58658  -0.617   0.5373    
Gen5:ID37    0.30129    1.54681   0.195   0.8457    
Gen6:ID37   -1.37315    1.48010  -0.928   0.3542    
Gen3:ID43   -0.44298    1.45474  -0.305   0.7609    
Gen4:ID43   -0.70395    1.36878  -0.514   0.6074    
Gen5:ID43    0.30094    1.36712   0.220   0.8259    
Gen6:ID43   -2.30305    1.32457  -1.739   0.0830 .  
Gen3:ID45    0.27303    1.39290   0.196   0.8447    
Gen4:ID45    1.87643    1.29867   1.445   0.1494    
Gen5:ID45    1.06209    1.24938   0.850   0.3959    
Gen6:ID45         NA         NA      NA       NA    
Gen3:ID58    0.12857    1.58711   0.081   0.9355    
Gen4:ID58   -0.88546    1.50992  -0.586   0.5580    
Gen5:ID58    0.49525    1.50735   0.329   0.7427    
Gen6:ID58   -1.54672    1.46931  -1.053   0.2932    
Gen3:ID64   -0.63984    1.66409  -0.385   0.7008    
Gen4:ID64   -0.58730    1.58711  -0.370   0.7116    
Gen5:ID64    1.15752    1.52230   0.760   0.4475    
Gen6:ID64   -0.78092    1.48146  -0.527   0.5984    
Gen3:ID68   -1.29369    1.58687  -0.815   0.4155    
Gen4:ID68   -0.91634    1.50632  -0.608   0.5434    
Gen5:ID68    0.01161    1.46276   0.008   0.9937    
Gen6:ID68   -1.04929    1.42049  -0.739   0.4606    
Gen3:ID69   -0.42327    1.66409  -0.254   0.7994    
Gen4:ID69   -1.54499    1.58753  -0.973   0.3311    
Gen5:ID69    1.04075    1.54669   0.673   0.5015    
Gen6:ID69   -1.03891    1.48042  -0.702   0.4833    
Gen3:ID72   -1.00478    1.46275  -0.687   0.4926    
Gen4:ID72   -1.22906    1.37521  -0.894   0.3721    
Gen5:ID72   -0.46668    1.36089  -0.343   0.7319    
Gen6:ID72   -2.07704    1.31545  -1.579   0.1152    
Gen3:ID73         NA         NA      NA       NA    
Gen4:ID73         NA         NA      NA       NA    
Gen5:ID73         NA         NA      NA       NA    
Gen6:ID73         NA         NA      NA       NA    
Gen3:ID74         NA         NA      NA       NA    
Gen4:ID74    0.59767    1.46276   0.409   0.6831    
Gen5:ID74    2.28027    1.46334   1.558   0.1201    
Gen6:ID74         NA         NA      NA       NA    
Gen3:ID75         NA         NA      NA       NA    
Gen4:ID75         NA         NA      NA       NA    
Gen5:ID75         NA         NA      NA       NA    
Gen6:ID75         NA         NA      NA       NA    
Gen3:ID78         NA         NA      NA       NA    
Gen4:ID78         NA         NA      NA       NA    
Gen5:ID78   -1.03559    1.37465  -0.753   0.4517    
Gen6:ID78         NA         NA      NA       NA    
Gen3:ID80   -2.24140    1.54641  -1.449   0.1481    
Gen4:ID80   -0.31165    1.46276  -0.213   0.8314    
Gen5:ID80    0.54875    1.37465   0.399   0.6900    
Gen6:ID80         NA         NA      NA       NA    
Gen3:ID83    0.66062    1.73810   0.380   0.7041    
Gen4:ID83    1.41472    1.66411   0.850   0.3958    
Gen5:ID83    1.00308    1.66533   0.602   0.5473    
Gen6:ID83   -0.16410    1.54659  -0.106   0.9156    
Gen3:ID85   -0.03199    1.58668  -0.020   0.9839    
Gen4:ID85    0.56258    1.50527   0.374   0.7088    
Gen5:ID85    0.33365    1.50662   0.221   0.8249    
Gen6:ID85   -0.01573    1.46295  -0.011   0.9914    
Gen3:ID87         NA         NA      NA       NA    
Gen4:ID87         NA         NA      NA       NA    
Gen5:ID87    0.72705    1.54696   0.470   0.6387    
Gen6:ID87         NA         NA      NA       NA    
Gen3:ID90   -1.08191    1.66412  -0.650   0.5160    
Gen4:ID90   -1.99026    1.54651  -1.287   0.1990    
Gen5:ID90    0.58267    1.52323   0.383   0.7023    
Gen6:ID90   -1.52391    1.46295  -1.042   0.2983    
Gen3:ID92   -0.59698    1.58668  -0.376   0.7070    
Gen4:ID92   -0.41567    1.50527  -0.276   0.7826    
Gen5:ID92    2.20921    1.46425   1.509   0.1323    
Gen6:ID92   -0.78100    1.39243  -0.561   0.5752    
Gen3:ID97   -0.52622    1.73810  -0.303   0.7623    
Gen4:ID97    0.43335    1.66411   0.260   0.7947    
Gen5:ID97    0.94320    1.66533   0.566   0.5715    
Gen6:ID97    0.27881    1.62594   0.171   0.8639    
Gen3:ID98   -0.90303    1.66412  -0.543   0.5877    
Gen4:ID98   -1.95525    1.54651  -1.264   0.2070    
Gen5:ID98   -1.10618    1.54782  -0.715   0.4753    
Gen6:ID98   -1.95746    1.50536  -1.300   0.1943    
Gen3:ID100   0.84551    1.73810   0.486   0.6269    
Gen4:ID100   0.64140    1.66411   0.385   0.7002    
Gen5:ID100   1.58727    1.66533   0.953   0.3412    
Gen6:ID100   0.69073    1.62594   0.425   0.6712    
Gen3:ID101   1.86031    1.58668   1.172   0.2418    
Gen4:ID101   1.93286    1.50527   1.284   0.2000    
Gen5:ID101   2.29557    1.50662   1.524   0.1285    
Gen6:ID101   1.82846    1.46295   1.250   0.2122    
Gen3:ID104  -1.85765    1.73810  -1.069   0.2859    
Gen4:ID104  -3.17278    1.66411  -1.907   0.0574 .  
Gen5:ID104  -1.99215    1.54782  -1.287   0.1989    
Gen6:ID104  -2.98978    1.50536  -1.986   0.0478 *  
Gen3:ID106  -0.41737    1.66412  -0.251   0.8021    
Gen4:ID106   0.54158    1.58668   0.341   0.7331    
Gen5:ID106   1.44681    1.54782   0.935   0.3506    
Gen6:ID106  -0.34875    1.50536  -0.232   0.8169    
Gen3:ID110        NA         NA      NA       NA    
Gen4:ID110   0.25723    1.54641   0.166   0.8680    
Gen5:ID110   1.55455    1.54696   1.005   0.3156    
Gen6:ID110        NA         NA      NA       NA    
Gen3:ID112  -0.09026    1.58668  -0.057   0.9547    
Gen4:ID112  -0.06853    1.50527  -0.046   0.9637    
Gen5:ID112   1.11058    1.46425   0.758   0.4487    
Gen6:ID112  -0.39053    1.41929  -0.275   0.7834    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.229 on 354 degrees of freedom
Multiple R-squared:  0.7189,    Adjusted R-squared:  0.5934 
F-statistic: 5.729 on 158 and 354 DF,  p-value: < 2.2e-16
ggplot(aes(log(Nm1), log(Seedlings/Nm1), group = GenID, color = Gen), data = seed_data) +
  geom_point() + geom_smooth(method = "lm", se = FALSE)

DD.glm <- glm(Seedlings ~ log(Nm1) + Gen * ID, data = seed_data, family = quasipoisson) 
car::Anova(DD.glm) 
Analysis of Deviance Table (Type II tests)

Response: Seedlings
         LR Chisq  Df Pr(>Chisq)    
log(Nm1)   431.29   1  < 2.2e-16 ***
Gen        185.62   4  < 2.2e-16 ***
ID          67.75  36  0.0010645 ** 
Gen:ID     175.90 117  0.0003525 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(DD.glm)

Call:
glm(formula = Seedlings ~ log(Nm1) + Gen * ID, family = quasipoisson, 
    data = seed_data)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-25.3458   -5.7571   -0.7389    3.1541   31.1101  

Coefficients: (27 not defined because of singularities)
             Estimate Std. Error t value Pr(>|t|)    
(Intercept)  4.043179   0.451232   8.960   <2e-16 ***
log(Nm1)     0.320337   0.017984  17.812   <2e-16 ***
Gen3         0.508582   0.544455   0.934   0.3509    
Gen4         0.553890   0.509055   1.088   0.2773    
Gen5        -0.190163   0.539657  -0.352   0.7248    
Gen6        -0.182980   0.540954  -0.338   0.7354    
ID2         -0.400706   0.484754  -0.827   0.4090    
ID3         -0.021455   0.406260  -0.053   0.9579    
ID8          0.247817   0.550162   0.450   0.6527    
ID10         0.114426   0.648556   0.176   0.8601    
ID12         0.318970   0.538649   0.592   0.5541    
ID14        -0.098951   0.618344  -0.160   0.8730    
ID16         0.072229   0.597492   0.121   0.9038    
ID20        -0.183721   0.628544  -0.292   0.7702    
ID24         0.163993   0.501472   0.327   0.7438    
ID30         0.279585   0.457183   0.612   0.5412    
ID37         0.350891   0.598379   0.586   0.5580    
ID43         0.396803   0.528499   0.751   0.4533    
ID45        -0.510483   0.460247  -1.109   0.2681    
ID58         0.065428   0.623718   0.105   0.9165    
ID64         0.419852   0.607499   0.691   0.4899    
ID68        -0.108707   0.616558  -0.176   0.8601    
ID69        -0.160725   0.664430  -0.242   0.8090    
ID72         0.275853   0.537597   0.513   0.6082    
ID73         0.599945   0.733991   0.817   0.4143    
ID74        -0.504465   0.989021  -0.510   0.6103    
ID75        -1.462305   1.876139  -0.779   0.4363    
ID78        -0.276681   0.894529  -0.309   0.7573    
ID80         0.031620   0.784600   0.040   0.9679    
ID83        -1.517451   1.791424  -0.847   0.3975    
ID85        -0.251443   0.876600  -0.287   0.7744    
ID87        -1.988398   2.425787  -0.820   0.4129    
ID90         0.369619   0.811782   0.455   0.6492    
ID92        -0.379618   0.919689  -0.413   0.6800    
ID97        -0.728993   1.252908  -0.582   0.5610    
ID98        -0.225467   1.014555  -0.222   0.8243    
ID100       -2.097269   2.360245  -0.889   0.3748    
ID101       -2.369203   2.213498  -1.070   0.2852    
ID104        0.812749   0.704261   1.154   0.2493    
ID106       -0.975127   1.396775  -0.698   0.4856    
ID110       -0.839775   1.391806  -0.603   0.5466    
ID112       -1.014657   1.189767  -0.853   0.3943    
Gen3:ID2    -0.307803   0.739524  -0.416   0.6775    
Gen4:ID2     0.606803   0.605179   1.003   0.3167    
Gen5:ID2     0.466544   0.657278   0.710   0.4783    
Gen6:ID2           NA         NA      NA       NA    
Gen3:ID3    -0.276063   0.608762  -0.453   0.6505    
Gen4:ID3     0.223899   0.533329   0.420   0.6749    
Gen5:ID3     0.507727   0.558195   0.910   0.3637    
Gen6:ID3           NA         NA      NA       NA    
Gen3:ID8    -0.366896   0.682695  -0.537   0.5913    
Gen4:ID8    -0.934527   0.673860  -1.387   0.1664    
Gen5:ID8     0.763515   0.655545   1.165   0.2449    
Gen6:ID8    -1.804249   0.837205  -2.155   0.0318 *  
Gen3:ID10   -0.419608   0.803372  -0.522   0.6018    
Gen4:ID10   -0.094649   0.735702  -0.129   0.8977    
Gen5:ID10    1.060966   0.737561   1.438   0.1512    
Gen6:ID10    0.020512   0.755992   0.027   0.9784    
Gen3:ID12   -0.378844   0.658266  -0.576   0.5653    
Gen4:ID12   -0.958024   0.635016  -1.509   0.1323    
Gen5:ID12    0.357942   0.641027   0.558   0.5769    
Gen6:ID12   -1.262101   0.701813  -1.798   0.0730 .  
Gen3:ID14    0.121812   0.754102   0.162   0.8718    
Gen4:ID14   -0.512795   0.736078  -0.697   0.4865    
Gen5:ID14    0.780078   0.717626   1.087   0.2778    
Gen6:ID14   -0.045252   0.735141  -0.062   0.9510    
Gen3:ID16   -0.061494   0.741496  -0.083   0.9340    
Gen4:ID16   -1.202151   0.759309  -1.583   0.1143    
Gen5:ID16    0.342530   0.717455   0.477   0.6334    
Gen6:ID16    0.100265   0.712884   0.141   0.8882    
Gen3:ID20   -0.151832   0.760273  -0.200   0.8418    
Gen4:ID20    0.257355   0.700666   0.367   0.7136    
Gen5:ID20    1.222094   0.712657   1.715   0.0872 .  
Gen6:ID20    0.079012   0.729305   0.108   0.9138    
Gen3:ID24          NA         NA      NA       NA    
Gen4:ID24          NA         NA      NA       NA    
Gen5:ID24    0.246244   0.690904   0.356   0.7217    
Gen6:ID24          NA         NA      NA       NA    
Gen3:ID30          NA         NA      NA       NA    
Gen4:ID30   -1.831454   0.834158  -2.196   0.0288 *  
Gen5:ID30    0.222646   0.677205   0.329   0.7425    
Gen6:ID30          NA         NA      NA       NA    
Gen3:ID37   -1.148642   0.807422  -1.423   0.1557    
Gen4:ID37   -0.211171   0.697251  -0.303   0.7622    
Gen5:ID37    0.222909   0.709411   0.314   0.7535    
Gen6:ID37   -0.468397   0.726377  -0.645   0.5194    
Gen3:ID43   -0.394548   0.645279  -0.611   0.5413    
Gen4:ID43   -0.235858   0.602237  -0.392   0.6956    
Gen5:ID43    0.460569   0.625507   0.736   0.4620    
Gen6:ID43   -1.422365   0.692022  -2.055   0.0406 *  
Gen3:ID45   -0.255481   0.698781  -0.366   0.7149    
Gen4:ID45    0.615898   0.578706   1.064   0.2879    
Gen5:ID45    0.749337   0.603219   1.242   0.2150    
Gen6:ID45          NA         NA      NA       NA    
Gen3:ID58    0.312379   0.732627   0.426   0.6701    
Gen4:ID58   -0.298223   0.705544  -0.423   0.6728    
Gen5:ID58    0.897457   0.714474   1.256   0.2099    
Gen6:ID58   -0.269180   0.748447  -0.360   0.7193    
Gen3:ID64   -0.025388   0.725404  -0.035   0.9721    
Gen4:ID64   -1.448796   0.752154  -1.926   0.0549 .  
Gen5:ID64   -0.005864   0.723116  -0.008   0.9935    
Gen6:ID64   -0.960080   0.750582  -1.279   0.2017    
Gen3:ID68   -0.650878   0.787310  -0.827   0.4090    
Gen4:ID68   -0.517148   0.735852  -0.703   0.4826    
Gen5:ID68    0.511472   0.728950   0.702   0.4834    
Gen6:ID68   -0.089564   0.748860  -0.120   0.9049    
Gen3:ID69    0.678626   0.769246   0.882   0.3783    
Gen4:ID69   -1.151739   0.812405  -1.418   0.1572    
Gen5:ID69    0.913132   0.763121   1.197   0.2323    
Gen6:ID69    0.041416   0.775749   0.053   0.9575    
Gen3:ID72   -0.107015   0.650276  -0.165   0.8694    
Gen4:ID72   -1.128138   0.636728  -1.772   0.0773 .  
Gen5:ID72    0.059203   0.644784   0.092   0.9269    
Gen6:ID72   -1.197553   0.694305  -1.725   0.0854 .  
Gen3:ID73          NA         NA      NA       NA    
Gen4:ID73          NA         NA      NA       NA    
Gen5:ID73          NA         NA      NA       NA    
Gen6:ID73          NA         NA      NA       NA    
Gen3:ID74          NA         NA      NA       NA    
Gen4:ID74   -0.049553   1.300382  -0.038   0.9696    
Gen5:ID74    1.156798   1.216100   0.951   0.3421    
Gen6:ID74          NA         NA      NA       NA    
Gen3:ID75          NA         NA      NA       NA    
Gen4:ID75          NA         NA      NA       NA    
Gen5:ID75          NA         NA      NA       NA    
Gen6:ID75          NA         NA      NA       NA    
Gen3:ID78          NA         NA      NA       NA    
Gen4:ID78          NA         NA      NA       NA    
Gen5:ID78   -0.564074   1.455171  -0.388   0.6985    
Gen6:ID78          NA         NA      NA       NA    
Gen3:ID80   -1.426381   1.518337  -0.939   0.3481    
Gen4:ID80   -0.551152   1.142841  -0.482   0.6299    
Gen5:ID80    0.866075   0.957986   0.904   0.3666    
Gen6:ID80          NA         NA      NA       NA    
Gen3:ID83    0.847253   2.019075   0.420   0.6750    
Gen4:ID83    1.695294   1.892558   0.896   0.3710    
Gen5:ID83    0.755477   2.237085   0.338   0.7358    
Gen6:ID83    0.977479   2.050248   0.477   0.6338    
Gen3:ID85   -0.539119   1.201553  -0.449   0.6539    
Gen4:ID85   -0.111520   1.089472  -0.102   0.9185    
Gen5:ID85   -0.330739   1.344317  -0.246   0.8058    
Gen6:ID85   -0.532982   1.418961  -0.376   0.7074    
Gen3:ID87          NA         NA      NA       NA    
Gen4:ID87          NA         NA      NA       NA    
Gen5:ID87    1.502677   2.695457   0.557   0.5775    
Gen6:ID87          NA         NA      NA       NA    
Gen3:ID90   -1.232501   1.173873  -1.050   0.2945    
Gen4:ID90   -2.327631   1.433981  -1.623   0.1054    
Gen5:ID90   -0.063752   0.990668  -0.064   0.9487    
Gen6:ID90   -1.339447   1.201562  -1.115   0.2657    
Gen3:ID92   -0.782119   1.335260  -0.586   0.5584    
Gen4:ID92   -0.710894   1.285595  -0.553   0.5806    
Gen5:ID92    1.566795   1.027148   1.525   0.1281    
Gen6:ID92   -0.261706   1.239341  -0.211   0.8329    
Gen3:ID97   -0.267420   1.654094  -0.162   0.8717    
Gen4:ID97    0.380419   1.470381   0.259   0.7960    
Gen5:ID97    0.637177   1.590949   0.401   0.6890    
Gen6:ID97    0.365301   1.672508   0.218   0.8272    
Gen3:ID98   -0.662733   1.328294  -0.499   0.6181    
Gen4:ID98   -1.269260   1.388927  -0.914   0.3614    
Gen5:ID98   -0.737178   1.469882  -0.502   0.6163    
Gen6:ID98   -1.129207   1.628438  -0.693   0.4885    
Gen3:ID100   1.457531   2.532780   0.575   0.5653    
Gen4:ID100   1.111118   2.577147   0.431   0.6666    
Gen5:ID100   2.321790   2.509014   0.925   0.3554    
Gen6:ID100   2.009830   2.554105   0.787   0.4319    
Gen3:ID101   1.900613   2.326914   0.817   0.4146    
Gen4:ID101   1.815185   2.322609   0.782   0.4350    
Gen5:ID101   1.883482   2.418857   0.779   0.4367    
Gen6:ID101   2.146589   2.376492   0.903   0.3670    
Gen3:ID104  -1.167309   1.074136  -1.087   0.2779    
Gen4:ID104  -2.606458   1.681809  -1.550   0.1221    
Gen5:ID104  -1.139406   1.065882  -1.069   0.2858    
Gen6:ID104  -2.033892   1.388089  -1.465   0.1437    
Gen3:ID106  -0.220900   1.708075  -0.129   0.8972    
Gen4:ID106   0.006833   1.634461   0.004   0.9967    
Gen5:ID106   1.619138   1.499301   1.080   0.2809    
Gen6:ID106  -0.144233   1.803312  -0.080   0.9363    
Gen3:ID110         NA         NA      NA       NA    
Gen4:ID110   0.092853   1.670387   0.056   0.9557    
Gen5:ID110   0.770948   1.696050   0.455   0.6497    
Gen6:ID110         NA         NA      NA       NA    
Gen3:ID112   0.117874   1.468375   0.080   0.9361    
Gen4:ID112  -0.027064   1.478746  -0.018   0.9854    
Gen5:ID112   1.032842   1.376415   0.750   0.4535    
Gen6:ID112   0.267973   1.529522   0.175   0.8610    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 75.14003)

    Null deviance: 147594  on 512  degrees of freedom
Residual deviance:  27271  on 354  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 5
seed_data$Fitted <- fitted(DD.glm)
seed_data <- mutate(seed_data,
                           resid2 = ((Seedlings/Nm1) - (Fitted/Nm1))^2)
ggplot(aes(1/Nm1, resid2), data = seed_data) + geom_point() + scale_y_log10() + 
  geom_smooth()
`geom_smooth()` using method = 'loess'

summary(lm(resid2 ~ I(1/Nm1), data = seed_data))

Call:
lm(formula = resid2 ~ I(1/Nm1), data = seed_data)

Residuals:
   Min     1Q Median     3Q    Max 
 -3131  -2730    -24     15 271770 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)   
(Intercept)   -21.57     829.56  -0.026   0.9793   
I(1/Nm1)     3152.28    1149.10   2.743   0.0063 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 12600 on 511 degrees of freedom
Multiple R-squared:  0.01451,   Adjusted R-squared:  0.01258 
F-statistic: 7.525 on 1 and 511 DF,  p-value: 0.006297

This does the trick: the intercept is still indistinguishable from zero. There appears to be one singleton remaining, though. The construction of seed_data is now complex enough that it should move into a munge script. And I still need to work out how to fit the random-effects model!
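A sketch of what that munge step and random-effects fit might look like. This assumes seed_data keeps the GenID, Gen, ID, Seedlings, and Nm1 columns used above, and the lmer formula is just one candidate specification, not a settled choice:

```r
library(dplyr)
library(lme4)

# Drop every singleton GenID level in one pass; filtering on group size
# avoids the match() bookkeeping and catches any stragglers.
seed_data <- seed_data %>%
  group_by(GenID) %>%
  filter(n() > 1) %>%
  ungroup() %>%
  droplevels()

# One candidate random-effects formulation: Gen as a fixed effect, with
# random intercepts for runway (ID) and the Gen-by-ID interaction.
DD.lmer <- lmer(log(Seedlings / Nm1) ~ log(Nm1) + Gen + (1 | ID) + (1 | Gen:ID),
                data = seed_data)
summary(DD.lmer)
```

Treating Gen:ID as a random intercept would absorb the 117-df interaction that the fixed-effects fit spends most of its parameters on.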

(PART) 7 June 2017

Taking stock

Here’s what we “knew” on May 1:

  • In the single-plant pots, average home seed production is consistent across generations 2-5, but is high in gen 1 and low in gen 6.
  • In the single-plant pots, the variance in home seed production seems lowest in gens 1, 3, and 6, and highest in gen 2 (this is based on looking at the IQR from the boxplots; I haven’t actually calculated the variances yet).
  • In the population runs, the density-dependent pattern of seed production looks, in expectation, the same across all generations except 6, which runs low. I can’t actually measure the density-dependent curve in gen 1, but the observations at \(N=50\) are consistent with the curves from gens 2-5. I have not looked at whether the residual variance differs among generations.

Note that these were based on “unclean” data, but reanalysis with clean data led to the same conclusions (May 19).

Further observations:

  • The number of home-pot seedlings in Ler B-3p declines linearly with time (19 May; looking at sqrt(seedlings) as the response). It’s not clear whether this is a pattern of declining seed production or one of increasing dispersal efficacy.
  • When looking at density dependence (log(Seedlings/Adults) ~ log(Adults), using 3p landscapes in both B and C), there appears to be a quadratic effect of generation, but this is largely driven by generation 6 (19 May)
  • Per-generation spread rates in the continuous runways (C-0p) were a quadratic function of time, had negative autocorrelation (\(\rho = -0.33\)), and showed weak evidence for differences among runways (\(P = 0.075\)) (24 May).
  • Per-generation spread rates in all fragmented runways were independent of generation and replicate, and had no autocorrelation (24 May).
  • Analysis of cumulative spread (24 May) showed that the linear approximation to mean spread is pretty good, although we see a dip in generation 6, as expected from the reduced seed production, and even without gen 6, the continuous runways seem to be decelerating. The ranking among landscapes makes sense. The variances are rather more all over the map, although we don’t have confidence intervals on them. But except for 1-pot gaps, we can squint and imagine that the variances are linear in time. Note, however, that with only 6 generations we’re not going to easily detect nonlinearities. I don’t know that the CVs tell us much.
  • Per-generation spread in the evolution experiment (24 May) are a bit of a mixed bag:
    • For evolution YES, we have:
      • Autocorrelation in gaps 0, 2, 3
      • Gen effects in gap 2
    • For evolution NO, we have:
      • Autocorrelation in gap 1
      • Gen effects in gaps 0, 1
  • Cumulative spread in the evolution experiment (24 May) shows more-or-less linear growth in the mean (though perhaps some deceleration in 2p); but the variance seems to level out in the evolution treatments showing autocorrelated spread. We need confidence intervals on these, however.
  • In treatment B, seedling production is effectively independent of silique number! (2 June) This is found both by looking at home pot seedlings in gappy landscapes and at total seedlings in the runway of the first generation of the continuous landscape.
  • Looking at home-pot seedling production in Ler B & C, 1p and 2p (which should be mostly locally produced, but colonizes multiple pots allowing us to look at runway effects), there are:
    • Density dependence (log-log model)
    • Variance among generations
    • Variance among runways within a generation
    • Variance among pots within a runway that goes to zero in the limit of large N, suggesting that it is just demographic stochasticity (as previously seen, however, this demographic stochasticity is large)

So far I’ve addressed 2, 3, 10, and 11 in the May 1 “remaining empirical questions.”

Strategy for a paper

Here’s what I think we want to lay out for the motivation of a paper:

  • We know there are lots of sources of stochasticity in spreading populations
  • This could lead to “unpredictability” in spread rates
  • Replicated lab populations have empirically demonstrated this unpredictability
  • However, we don’t know how various sources of stochasticity and heterogeneity contribute to spread rate unpredictability
    • This could be important if some of those sources can themselves be predicted
    • Need to double-check what Melbourne & Hastings did; there are some other experimental spread studies (without evolution, or without evolutionary controls) to check.
  • In this paper, we do three things:
    • Develop a simple model of stochastic spread
    • Demonstrate how different sources of stochasticity contribute to spread variability in a monotypic plant population
    • Demonstrate how evolution in a spreading population might make spread more or less predictable.

11 Sep 2017

Lewis et al (2016)

Mark Lewis’s new book has a whole chapter on “stochastic spread:”

Our focus is on the rate at which such processes spread spatially. We first examine the effects of environmental stochasticity on spatial spread by means of stochastic integrodifference and reaction–diffusion models. Here, we analyze both the wave solution for the expected density of individuals and the wave solution for a given realization of the stochastic process, as well as the variability that this can exhibit. Then we turn our analysis to the effects of demographic stochasticity on spatial spread. This can be described by its effects on the expected density and also by its effects on the velocity of the furthest-forward location for the population. Finally, we consider nonlinear stochastic models for patchy spread of invasive species, showing how patchiness in the leading edge of an invasion process can dramatically slow the invasive spreading speed. (p. 233)

Environmental stochasticity

The stochastic integrodifference model has a time-varying population growth function and a time-varying dispersal kernel, but both are assumed to be spatially uniform. Here’s a useful commentary on stochastic kernels:

A dispersal kernel is a probability density function of a random variable, that random variable being the location of a dispersed offspring. In other words, a dispersal kernel is a guide, or template, prescribing how likely an offspring is to fall in a given infinitesimal location. As such, it is typically an ordinary function and is not usually a random variable itself. What, then, do we mean by referring to the \(k_t\) as “random dispersal kernels”? We mean that the function \(k_t\) is not known in advance but rather is defined randomly. For example, it could be that \(k_t = N(0,\nu)\) where \(\nu\) is a random variable. Thus, even the template for dispersal, and not just dispersal itself, is stochastic.

If growth and dispersal variability are uncorrelated, then the “average population” spreads at the rate given by the average environment. They call this the speed of the “expectation wave.” Positive autocorrelation between growth and dispersal will speed things up.

They then turn to the “stochastic wave,” which has location \(X_t\) and “average” speed (i.e., averaged from time 0 to time \(t\)) \(\bar{C_t}\). Assuming an exponential spatial distribution at time 0, they derive values for the expectation and variance of \(\bar{C_t}\), which depend on the steepness of the initial condition and the moment generating function of the “representative” kernel (they don’t really say how this is defined; it’s actually introduced in the previous analysis). They then show that, whereas \(\bar{C_t}\) converges to its expectation as \(t\) gets large, the variance of \(X_t\) increases linearly with time (p. 239; fig. 8.1). This is the “simple model” that I was going to develop; here it’s already been done for me!

One thing they note is that (at least with iid/independence between dispersal and growth) the speed of the stochastic wave will be less than the speed of the expectation wave. Thus, stochasticity slows spread, on average.
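The linear growth of \(var(X_t)\) is easy to check numerically. Here is a minimal sketch (all parameter values hypothetical, not taken from the book): if per-generation spread increments are iid, the front location is a random walk with drift, so both the mean and the variance of \(X_t\) grow linearly in \(t\).

```r
# Toy stochastic spread: iid per-generation increments
set.seed(1)
n_reps <- 1000                         # replicate invasions
n_gen  <- 20                           # generations
# hypothetical increments: mean 1 pot/gen, sd 0.5 pots/gen
inc <- matrix(rnorm(n_reps * n_gen, mean = 1, sd = 0.5), n_reps, n_gen)
X   <- t(apply(inc, 1, cumsum))        # front location by rep and generation
plot(1:n_gen, apply(X, 2, var), xlab = "Generation", ylab = "var(X_t)")
abline(0, 0.5^2, lty = 2)              # theory: var(X_t) = t * sigma^2
```

Autocorrelated increments would change the slope (and the curvature at short times); with only 6 generations, as in our data, such departures will be hard to see.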

A key reference for this work seems to be Neubert, M.G., Kot, M., Lewis, M.A.: Invasion speeds in fluctuating environments. Proc. R. Soc. B 267, 1603–1610 (2000). doi:10.1098/rspb.2000.1185

In fact, fig. 8.1 above is fig 4 in Neubert et al (2000), which also has the observation about the linear growth in \(var(X_t)\) at the bottom of p. 1606. This is what we should actually cite!

They point out that it has been extended to structured populations by Schreiber, S.J., Ryan, M.E.: Invasion speeds for structured populations in fluctuating environments. Theor. Ecol. 4, 423–434 (2011). doi:10.1007/s12080-010-0098-5

They also briefly consider a stochastic reaction-diffusion equation. There, spatial autocorrelation in the temporally-varying growth environment leads to speeding up of spread.

Demographic stochasticity

Starts with two lab examples: Tribolium from Melbourne and Hastings (2009), and ciliates from Giometto, A., Rinaldo, A., Carrara, F., Altermatt, F.: Emerging predictable features of replicated biological invasion fronts. Proc. Natl. Acad. Sci. U. S. A. 111(1), 297–301 (2014). doi:10.1073/pnas.1321167110 [this paper uses a stochastic PDE to fit the dynamics]. They accept the authors’ respective claims that Tribolium spread “could not be predicted precisely” but ciliate spread could be “predicted well.” But the former claim refers to a single realization whereas the latter refers to the mean and variance across replicates.

Models henceforth focus on the furthest-forward individual. Brief treatment of continuous-time linear branching process, which I don’t think is too relevant here. Then an extended treatment of Clark et al’s “invasion by extremes” paper. They use a one-parameter simplification of the 2Dt kernel: \[ k(x) = \frac{1}{2\sqrt{2\beta} \left(1 + \frac{x^2}{2\beta}\right)^{3/2}}; \] not sure if this was Clark’s original choice. The section doesn’t treat variance in spread but instead focuses on mean spread, with lower and upper bounds depending on assumptions about contributions from individuals behind the front.

They finish the chapter by talking about “patchy spread,” but here they mean something like stratified diffusion rather than a patchy environment. There’s probably some interesting stuff here but not needed for our current work.

Neubert et al 2000

Confirmed that fig. 8.1 above is fig 4 in Neubert et al (2000), which also has the observation about the linear growth in \(var(X_t)\) at the bottom of p. 1606. This is what we should actually cite!

Looked through the list of papers citing this work, and nothing new jumped out at me. In fact, it seems that none of the experimental spread papers cite this work!

13 Sep 2017

Dispersal kernels

Quick looks at dispersal. Lewis et al 2008 show how to calculate empirical moment generating functions, but I’m not sure how useful that is for us. Nathan et al. 2012 has a good review of kernels, but not much statistical info. I found Viana et al. 2016, which suggests fitting the CDF is better than fitting the PDF.

Viana also led me to the fitdistrplus package, which fits standard distributions. I think this is what we need—I was increasingly unconfident in my home-rolled fitting routines. The only rub is that I’ll need to write d, p and q functions for any “special” distributions (e.g., 2Dt).
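For the 2Dt, the one-parameter form quoted on 11 Sep happens to invert analytically, so the d/p/q trio is short. Here’s a sketch (the names d2dt/p2dt/q2dt are mine; as written this is the symmetric kernel on the whole line, so it would still need folding or truncating before fitting to distance data):

```r
# One-parameter 2Dt kernel (Lewis et al.):
#   k(x) = 1 / (2 * sqrt(2*beta) * (1 + x^2/(2*beta))^(3/2))
# With u = x / sqrt(2*beta), the CDF is F(x) = 1/2 + u / (2*sqrt(1 + u^2)),
# which inverts in closed form.
d2dt <- function(x, beta)
{
  1 / (2 * sqrt(2 * beta) * (1 + x^2 / (2 * beta))^(3/2))
}
p2dt <- function(q, beta)
{
  u <- q / sqrt(2 * beta)
  0.5 + u / (2 * sqrt(1 + u^2))
}
q2dt <- function(p, beta)
{
  w <- 2 * p - 1                       # maps (0,1) onto (-1,1)
  sqrt(2 * beta) * w / sqrt(1 - w^2)
}
# sanity check: p2dt(q2dt(0.9, beta = 2), beta = 2) returns 0.9
```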

19 September 2017

Dispersal data

Jenn sent info about the Ler and RIL dispersal data.

Ler

The data from the sticky paper kernels for Ler are called 2013_08_08_Exp1_Spray.csv. And I’m attaching the script I used to analyze them [FitKernels_Exp1_SPRAY_aug2013.R (in Arabidopsis/analysis)].

I fit negative exponential kernels, which I know isn’t necessarily the best fit, but is what I needed for the simulations I ran (for the density dependence paper). For reasons I don’t know, the mean dispersal distance on sticky paper was further than in Generation 1. This of course assumes all seeds would have germinated, but we know mostly they do. Number of moms in control ranged from 2–1200, but I found no significant relationship between density and dispersal distance. I used the estimate from the sticky paper data to keep m = 2 in my models (mean dispersal distance of ~0.5 pots). Otherwise, my simulated invasions moved much more slowly than they did in reality.

I then asked

I’m not quite sure how to interpret the Ler data you sent – e.g., sometimes “pot” is > 0, but most of the seeds in “pot_gen3” are in pot 0.

Jenn replied

As for the Ler data, we had started the experiment several generations earlier and then for a number of reasons decided to scrap it and start over. So we then took pots from a wide range of densities (from different places in the runways) to put next to the sticky paper. This means they weren’t all pot 0. Shouldn’t matter which pot they are (at least that was the idea). The pot numbers are just relics of this past.

RILs

You also asked about sticky paper data for the RILs. I don’t think we ever dispersed the RILs onto sticky paper, but we did have 2 trials onto empty pots (with solitary moms). Script that I used attached, which references the appropriate csv files. [FitDispersal_RILs_30May2016.R (in Arabidopsis/analysis)]

I asked

It looks like I’ve already used “2015_06_30_RilsDispersal.csv”; I’ll add the 2016 data (is there any reason not to combine them?).

Jenn replied

I don’t recall any reason not to combine the 2015 & 2016 RIL data. That said, when I ran through that script, it definitely looks like you can get a really different set of ranks depending on which data you use. I’ll look in the morning (at my paper notes) to see if there’s anything obviously different about how the data were collected.

I think that the RIL data were the ones where I identified some data entry issues when I was in Zurich. I’ll need to double check that this file has the corrected data.

File to load Ler dispersal data

Here is the new file data/disperseLer.R:

### Creates the data object disperseLer, representing the Ler dispersal experiment

# Get data from Jenn's file
data_dir <- "~/Dropbox/Arabidopsis/analysis"
disperseLer <- read.csv(file.path(data_dir, '2013_08_08_Exp1_Spray.csv'), 
                        header = TRUE)

# Drop the "clipped" treatment
disperseLer <- droplevels(subset(disperseLer, new_trt != "clipped", drop = TRUE))

# Drop the columns with the (irrelevant) info about where the mom pots came from
disperseLer <- disperseLer[, -c(1:4, 6)]

# Clean up column names 
names(disperseLer) <- c("ID", "Pot", "Distance", "Seedlings", "Siliques", "Density",
                        "Treatment")

# Make some factor variables
disperseLer$ID <- as.factor(disperseLer$ID)

# Clean up the workspace
rm(data_dir)

# Auto-cache the data
ProjectTemplate::cache("disperseLer")

Summarizing Ler dispersal data

Here is a file to summarize the data by replicate, in lib/helpers.R:

calc_dispersal_stats <- function(dispersal_data) {
  dispersal_stats <- group_by(dispersal_data, ID, Density, Siliques) %>%
    summarise(
      Total_seeds = sum(Seedlings),
      Home_seeds = Seedlings[1],
      Dispersing_seeds = Total_seeds - Home_seeds,
      Dispersal_fraction = Dispersing_seeds / Total_seeds,
      # Dispersal stats including all seeds
      Mean_dispersal_all = sum(Seedlings * (Distance - 4)) / Total_seeds,
      RMS_dispersal_all = sqrt(sum(Seedlings * (Distance - 4)^2) / Total_seeds),
      # Dispersal stats including just dispersing seeds
      Mean_dispersal = sum(Seedlings * (Distance - 4)) / Dispersing_seeds,
      RMS_dispersal = sqrt(sum(Seedlings * (Distance - 4)^2) / Dispersing_seeds)
    )
  return (dispersal_stats)
}

Let’s look at how the total seed number, the fraction dispersing, and the dispersal distances scale with density:

Ler_dispersal_stats <- calc_dispersal_stats(disperseLer)
qplot(Density, Total_seeds, data = Ler_dispersal_stats)

qplot(Density, Dispersal_fraction, data = Ler_dispersal_stats)

qplot(Density, Mean_dispersal_all, data = Ler_dispersal_stats)

qplot(Density, Mean_dispersal, data = Ler_dispersal_stats)

qplot(Density, RMS_dispersal_all, data = Ler_dispersal_stats)

qplot(Density, RMS_dispersal, data = Ler_dispersal_stats)

Except for total seed number, there don’t seem to be any patterns here. For the single plants, let’s look at whether silique number (which might be correlated with plant size) can predict anything:

qplot(Siliques, Total_seeds, data = subset(Ler_dispersal_stats, !is.na(Siliques)))

qplot(Siliques, Dispersal_fraction, data = subset(Ler_dispersal_stats, !is.na(Siliques)))

qplot(Siliques, Mean_dispersal_all, data = subset(Ler_dispersal_stats, !is.na(Siliques)))

qplot(Siliques, Mean_dispersal, data = subset(Ler_dispersal_stats, !is.na(Siliques)))

qplot(Siliques, RMS_dispersal_all, data = subset(Ler_dispersal_stats, !is.na(Siliques)))

qplot(Siliques, RMS_dispersal, data = subset(Ler_dispersal_stats, !is.na(Siliques)))

Again, no patterns. This makes modeling simpler!

One more thing to look at: are the mean dispersal distances correlated with either the fraction dispersing or the number of dispersing seeds?

qplot(Dispersing_seeds, Mean_dispersal, data = Ler_dispersal_stats)

qplot(Dispersing_seeds, RMS_dispersal, data = Ler_dispersal_stats)

qplot(Dispersal_fraction, Mean_dispersal, data = Ler_dispersal_stats)

qplot(Dispersal_fraction, RMS_dispersal, data = Ler_dispersal_stats)

Here we see a correlation with both. This could just be a sampling effect; once we fit kernel parameters we’ll have to see if they have correlations as well. If so, that suggests that the same variables (such as canopy height and structure, and intensity of simulated rainfall) that affect whether seeds leave the home pot also affect how far they go.

Here’s a multiple regression to tease apart the two variables:

summary(lm(Mean_dispersal ~ Dispersal_fraction + Dispersing_seeds, data = Ler_dispersal_stats))

Call:
lm(formula = Mean_dispersal ~ Dispersal_fraction + Dispersing_seeds, 
    data = Ler_dispersal_stats)

Residuals:
    Min      1Q  Median      3Q     Max 
-1.7144 -0.5299 -0.2285  0.3843  4.4228 

Coefficients:
                    Estimate Std. Error t value Pr(>|t|)    
(Intercept)        6.0255317  0.4449107  13.543 1.76e-15 ***
Dispersal_fraction 2.5229534  1.1204944   2.252   0.0307 *  
Dispersing_seeds   0.0016311  0.0008397   1.942   0.0602 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.148 on 35 degrees of freedom
Multiple R-squared:  0.3358,    Adjusted R-squared:  0.2979 
F-statistic: 8.849 on 2 and 35 DF,  p-value: 0.000776
summary(lm(RMS_dispersal ~ Dispersal_fraction + Dispersing_seeds, data = Ler_dispersal_stats))

Call:
lm(formula = RMS_dispersal ~ Dispersal_fraction + Dispersing_seeds, 
    data = Ler_dispersal_stats)

Residuals:
    Min      1Q  Median      3Q     Max 
-31.558 -11.516  -5.909   7.545  97.275 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)        40.98162    9.35833   4.379 0.000103 ***
Dispersal_fraction 44.97236   23.56868   1.908 0.064603 .  
Dispersing_seeds    0.03231    0.01766   1.829 0.075904 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 24.15 on 35 degrees of freedom
Multiple R-squared:  0.2861,    Adjusted R-squared:  0.2453 
F-statistic: 7.015 on 2 and 35 DF,  p-value: 0.002743

It appears that both are playing a role. But the dispersal fraction is the stronger predictor, which suggests that there is a real mechanism here, and not just sampling farther into the kernel.

Kernel fitting issues

Our data are actually interval-censored, and we need to ensure that we are taking that into account. I believe that the likelihood functions I wrote previously do so, but they lack the rigor of fitdistrplus. Unfortunately, the data have another issue: the seeds in pot zero are a mixture of “non-dispersers” (i.e., seeds that fall straight down and aren’t part of the kernel) and “short-distance dispersers” (seeds moving according to a dispersal kernel, but traveling less than 3 cm). Accounting for this requires either writing a zero-inflated distribution (which will require custom functions for everything) or telling the objective function to truncate the distribution. Again, I believe that my prior functions do the latter, but I’m not truly confident of that. Unfortunately, there doesn’t seem to be an obvious way to account for this in fitdistrplus using out-of-the-box distributions.

Actually, I take that back—the issue of truncation is addressed in the FAQs. It does require writing new distribution functions, but they are pretty straightforward. Here is their example for a (doubly) truncated exponential distribution:

dtexp <- function(x, rate, low, upp)
{
  PU <- pexp(upp, rate=rate)
  PL <- pexp(low, rate=rate)
  dexp(x, rate) / (PU-PL) * (x >= low) * (x <= upp) 
}
ptexp <- function(q, rate, low, upp)
{
  PU <- pexp(upp, rate=rate)
  PL <- pexp(low, rate=rate)
  (pexp(q, rate)-PL) / (PU-PL) * (q >= low) * (q <= upp) + 1 * (q > upp)
}

Looking at the dispersal patterns

Let’s start by just plotting the raw data. The first plot is just the empirical kernels; the second puts the vertical axis on a log scale (so exponential distributions should look linear and normal distributions should look quadratic) and the third puts both axes on a log scale (so lognormal distributions should look quadratic).

ggplot(aes(x = Distance, y = Seedlings), data = subset(disperseLer, Distance > 4)) +
  geom_line() + facet_wrap(~ID, scales = "free_y") 

ggplot(aes(x = Distance, y = Seedlings), data = subset(disperseLer, Distance > 4)) +
  geom_line() + facet_wrap(~ID, scales = "free_y") + scale_y_log10()

ggplot(aes(x = Distance, y = Seedlings), data = subset(disperseLer, Distance > 4)) +
  geom_line() + facet_wrap(~ID, scales = "free_y") + scale_y_log10() + scale_x_log10()

In many cases, it seems like a (truncated) normal distribution might be best! Rep 87_1 is going to be particularly problematic, as it is strongly sigmoid.

Trial fitting

I’ll start by selecting a single rep with a large number of seeds and a well-behaved distribution. Let’s use 87_0.

Pull out the data and convert into the format required by fitdistrplus:

data_87_0 <- subset(disperseLer, ID == "87_0" & Distance > 4)
data_vec <- rep(data_87_0$Distance, data_87_0$Seedlings)
cens_data_tble <- data.frame(left = data_vec - 4.5, right = data_vec - 3.5)

Now construct the distribution functions for a truncated normal. I code in the “standard” truncation level.

dtnorm <- function(x, mean, sd, low = 3.5)
{
  PU <- 1
  PL <- pnorm(low, mean = mean, sd = sd)
  dnorm(x, mean, sd) / (PU-PL) * (x >= low) 
}
ptnorm <- function(q, mean, sd, low = 3.5)
{
  PU <- 1
  PL <- pnorm(low, mean = mean, sd = sd)
  (pnorm(q, mean, sd)-PL) / (PU-PL) * (q >= low)
}

So now let’s give it a try:

library(fitdistrplus)
Loading required package: MASS

Attaching package: 'MASS'
The following object is masked from 'package:dplyr':

    select
Loading required package: survival
Loading required package: methods
fit_tnorm <- fitdistcens(cens_data_tble, "tnorm", start = list(mean = 6, sd = 10))
summary(fit_tnorm)
Fitting of the distribution ' tnorm ' By maximum likelihood on censored data 
Parameters
     estimate Std. Error
mean 7.681504  0.5930568
sd   5.588817  0.3661223
Fixed parameters:
data frame with 0 columns and 0 rows
Loglikelihood:  -1242.553   AIC:  2489.106   BIC:  2497.343 
Correlation matrix:
          mean        sd
mean  1.000000 -0.803976
sd   -0.803976  1.000000
cdfcompcens(fit_tnorm, Turnbull.confint = TRUE)

This looks like a pretty darn good fit!

Interestingly, using the non-censored method gives almost identical parameter estimates and CDF:

fit3 <- fitdist(data_vec - 4, "tnorm", start = list(mean = 6, sd = 10))
summary(fit3)
Fitting of the distribution ' tnorm ' by maximum likelihood 
Parameters : 
     estimate Std. Error
mean 7.696363  0.5870556
sd   5.583054  0.3626565
Loglikelihood:  -1242.376   AIC:  2488.752   BIC:  2496.988 
Correlation matrix:
          mean        sd
mean  1.000000 -0.801385
sd   -0.801385  1.000000
cdfcomp(fit3)

Thus, we can use some of the other diagnostic plots for non-censored data:

denscomp(fit3, breaks = 3:23+0.5, plotstyle = "ggplot", xlim = c(3, 24))

# qqcomp(fit3) # This one requires defining a quantile function!
ppcomp(fit3)

Unfortunately, we need additional work to get qqcomp working; and denscomp only gives a sensible-looking histogram with tweaking (and even there, the bins are not centered appropriately, so the histogram is shifted).
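qqcomp only needs a quantile function, which for the left-truncated normal follows directly from inverting ptnorm above. A sketch, using my qtnorm naming to match the dtnorm/ptnorm convention:

```r
# Quantile function for the left-truncated normal: invert
# ptnorm(q) = (pnorm(q) - PL) / (1 - PL) for q >= low
qtnorm <- function(p, mean, sd, low = 3.5)
{
  PL <- pnorm(low, mean = mean, sd = sd)
  qnorm(PL + p * (1 - PL), mean = mean, sd = sd)
}
# with this defined, qqcomp(fit3) should run
```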

Here’s a rough draft of a better figure:

xx <- seq(3.3, max(fit3$data) + 0.7, 0.1)
plot(xx, dtnorm(xx, fit3$estimate[1], fit3$estimate[2]), type = "l", col = "red")
points(table(fit3$data)/length(fit3$data))

We can make analogous functions for the lognormal distribution:

dtlnorm <- function(x, meanlog, sdlog, low = 3.5)
{
  PU <- 1
  PL <- plnorm(low, meanlog = meanlog, sdlog = sdlog)
  dlnorm(x, meanlog, sdlog) / (PU-PL) * (x >= low) 
}
ptlnorm <- function(q, meanlog, sdlog, low = 3.5)
{
  PU <- 1
  PL <- plnorm(low, meanlog = meanlog, sdlog = sdlog)
  (plnorm(q, meanlog, sdlog)-PL) / (PU-PL) * (q >= low)
}

Now fit to the censored data and compare with our normal fit

fit_tlnorm <- fitdistcens(cens_data_tble, "tlnorm", start = list(meanlog = log(6), sdlog = log(10)))
summary(fit_tlnorm)
Fitting of the distribution ' tlnorm ' By maximum likelihood on censored data 
Parameters
         estimate Std. Error
meanlog 2.1737212 0.02373376
sdlog   0.4584419 0.01879380
Fixed parameters:
data frame with 0 columns and 0 rows
Loglikelihood:  -1254.488   AIC:  2512.976   BIC:  2521.212 
Correlation matrix:
           meanlog      sdlog
meanlog  1.0000000 -0.2594226
sdlog   -0.2594226  1.0000000
cdfcompcens(list(fit_tnorm, fit_tlnorm), Turnbull.confint = TRUE, 
            legendtext = c("truncated normal", "truncated lognormal"))

The CDF of the log-normal is not as good, and the AIC is about 24 units worse.

Fit all the replicates

Start by writing a function that takes a single replicate’s data and does the analysis

fit_dispersal_models <- function(dispersal_data) {
  data_loc <- subset(dispersal_data, Distance > 4)
  data_vec <- rep(data_loc$Distance, data_loc$Seedlings)
  cens_data_tble <- data.frame(left = data_vec - 4.5, right = data_vec - 3.5)
  fit_tnorm <- fitdistcens(cens_data_tble, "tnorm", start = list(mean = 6, sd = 10))
  fit_tlnorm <- fitdistcens(cens_data_tble, "tlnorm", start = list(meanlog = log(6), sdlog = log(10)))
  cdfcompcens(list(fit_tnorm, fit_tlnorm), Turnbull.confint = TRUE, main = dispersal_data$ID[1],
              legendtext = c("truncated normal", "truncated lognormal"))
  stnorm <- summary(fit_tnorm)
  stlnorm <- summary(fit_tlnorm)
  data.frame(ID = dispersal_data$ID[1], AICnorm = stnorm$aic, AIClnorm = stlnorm$aic, 
             mu = stnorm$est[1], sd = stnorm$est[2],
             mulog = stlnorm$est[1], sdlog = stlnorm$est[2])
}

Test it out:

fit_dispersal_models(subset(disperseLer, ID == "87_0"))  

       ID  AICnorm AIClnorm       mu       sd    mulog     sdlog
mean 87_0 2489.106 2512.976 7.681504 5.588817 2.173721 0.4584419
ddply(subset(disperseLer, ID == "87_0" | ID == "128_1"), "ID",  
  fit_dispersal_models)

     ID  AICnorm AIClnorm         mu        sd    mulog     sdlog
1 128_1 4388.528 4368.020 -41.254702 13.903769 1.723583 0.5179114
2  87_0 2489.106 2512.976   7.681504  5.588817 2.173721 0.4584419

Full run

all_Ler_fits <- ddply(disperseLer, "ID", fit_dispersal_models) 

Warning in sqrt(diag(varcovar)): NaNs produced
Warning in sqrt(1/diag(V)): NaNs produced
Warning in cov2cor(varcovar): diag(.) had 0 or NA entries; non-finite
result is doubtful

all_Ler_fits
      ID    AICnorm   AIClnorm           mu        sd    mulog     sdlog
1  100_0  210.19617  209.46451    4.0982669  2.598637 1.662753 0.3158626
2  101_0  602.77572  592.94699    5.1014741  3.277311 1.839458 0.3314258
3  104_0 3917.28412 3922.72793  -66.0205798 19.643467 1.781924 0.6310585
4  105_0 1749.42577 1745.88089    4.3716320  3.528681 1.788268 0.3682102
5  106_0  646.69449  643.17978 -207.7151988 30.384323 1.307769 0.7884596
6  107_0  785.44720  789.47637  -69.4381348 16.731328 1.359045 0.6932002
7  107_2 1063.19962 1049.11621  -70.4622184 16.881924 1.740646 0.4708113
8  108_0 3238.39521 3253.82893    5.1491554  5.733904 2.030345 0.4653378
9  109_0  986.23168  993.20962    0.3779014  6.555358 1.830760 0.5150772
10 118_0  204.70929  204.18931    5.0644220  2.761443 1.771622 0.3257241
11 119_0  159.50334  161.20702   -6.3007402 10.223254 1.877411 0.6364096
12 125_0 1535.73278 1535.21286    4.3296332  3.730306 1.802977 0.3811286
13 128_1 4388.52816 4368.01998  -41.2547017 13.903769 1.723583 0.5179114
14 131_0  238.69069  240.39449   -4.1084462  5.925933 1.515535 0.5133134
15 133_0  157.21511  155.12266    5.1511826  3.886229 1.897703 0.3621337
16 134_0  309.97239  302.59321    3.1582072  4.429201 1.839981 0.3572949
17 135_0 2047.52526 2051.75059    6.8726239  4.285294 2.044296 0.3991739
18  73_0 1532.73721 1536.47422    3.8581071  5.121355 1.901724 0.4439891
19  75_0 1268.28878 1260.59024    4.1641279  5.606950 1.984015 0.4344484
20  75_1  713.93094  712.15268    5.1754515  3.910981 1.888703 0.3829246
21  77_0 1155.49401 1152.37149    1.9264317  4.898033 1.780264 0.4203578
22  78_2 3036.48468 3056.08625    6.5758420  5.186990 2.078296 0.4447256
23  79_0   16.43216   16.74522    4.7295954  1.069369 1.567024 0.1989932
24  85_1  732.95641  721.38097    2.8431735  4.174413 1.781615 0.3625076
25  85_2 4355.97098 4360.51040    3.7176899  5.757882 1.957874 0.4585569
26  87_0 2489.10649 2512.97557    7.6815043  5.588817 2.173721 0.4584419
27  87_1 4843.91681 4931.83322    8.2138623  4.811877 2.161850 0.4406535
28  87_2 2518.89435 2497.62958    3.4870029  4.699188 1.861861 0.4004055
29  88_0  505.51054  498.61356    3.1493213  3.455050 1.709980 0.3395724
30  88_1  632.33299  627.93489    0.9579404  4.372635 1.677200 0.3862587
31  89_1 2374.42168 2376.20474  -34.9798971 15.374743 1.835881 0.6015254
32  90_0 1414.00768 1402.33081    4.8038650  4.040552 1.881511 0.3774301
33  90_1  127.58152  128.43750  -14.7687132  7.623299 1.276556 0.5866747
34  91_2 3391.73490 3426.96778    2.8737605  7.010140 1.991896 0.5304049
35  93_0 1029.93150 1033.78692    2.0282346  5.749507 1.853023 0.4719776
36  95_0  639.50220  636.32519   -4.6097825  6.728016 1.717181 0.4532293
37  98_0 1956.80573 1995.93158   12.3030722  4.651470 2.468955 0.3797439
38  99_0 1252.70855 1262.56877    2.1108985  6.230271 1.889331 0.5023501

Some AIC stats

sum(all_Ler_fits$AICnorm)
[1] 58230.28
sum(all_Ler_fits$AIClnorm)
[1] 58366.17
cbind(all_Ler_fits$ID, all_Ler_fits$AICnorm - all_Ler_fits$AIClnorm)
      [,1]        [,2]
 [1,]    1   0.7316622
 [2,]    2   9.8287266
 [3,]    3  -5.4438047
 [4,]    4   3.5448752
 [5,]    5   3.5147126
 [6,]    6  -4.0291707
 [7,]    7  14.0834126
 [8,]    8 -15.4337230
 [9,]    9  -6.9779437
[10,]   10   0.5199870
[11,]   11  -1.7036794
[12,]   12   0.5199187
[13,]   13  20.5081874
[14,]   14  -1.7037993
[15,]   15   2.0924486
[16,]   16   7.3791782
[17,]   17  -4.2253246
[18,]   18  -3.7370105
[19,]   19   7.6985462
[20,]   20   1.7782603
[21,]   21   3.1225217
[22,]   22 -19.6015753
[23,]   23  -0.3130643
[24,]   24  11.5754364
[25,]   25  -4.5394237
[26,]   26 -23.8690776
[27,]   27 -87.9164058
[28,]   28  21.2647714
[29,]   29   6.8969805
[30,]   30   4.3980948
[31,]   31  -1.7830564
[32,]   32  11.6768719
[33,]   33  -0.8559744
[34,]   34 -35.2328843
[35,]   35  -3.8554141
[36,]   36   3.1770114
[37,]   37 -39.1258517
[38,]   38  -9.8602207

So there’s variation among reps in which model fits better. In sum, the normal distribution is better. However, the normal’s parameter estimates are much more variable (note the large negative means), in ways that really don’t make sense biologically unless there is directionally biased dispersal that varies from rep to rep.

22 September 2017


The next steps in the Ler dispersal analysis are to confirm statistical support for among-rep heterogeneity and see if the kernel parameters are correlated with the fraction dispersing.

Housekeeping

Added the functions for the truncated distributions and to fit the models to lib/helpers.R.

Kernel heterogeneity

The total AIC for the rep-specific fits was 58230.28 and 58366.17 for the normal and lognormal models, respectively.

I think that if I pass the whole data set to the analysis function it will combine all the data. Let’s look at this:

sum(disperseLer$Seedlings[disperseLer$Distance > 4])
[1] 11928
data_loc <- subset(disperseLer, Distance > 4)
data_vec <- rep(data_loc$Distance, data_loc$Seedlings)
length(data_vec)
[1] 11928

That’s a match!

fit_dispersal_models(disperseLer)

       ID  AICnorm AIClnorm         mu       sd    mulog   sdlog
mean 73_0 59772.17 59799.07 -0.7149118 7.407026 1.905282 0.50168

Ignore the ID on the graphical and text output.

What we see are good fits overall. Again, the normal is somewhat better than the lognormal (\(\Delta_{\text{AIC}} =\) 26.9). But the rep-specific fits have AICs that are more than 1400 AIC units lower!

Correlations with fraction dispersing

Regenerate the datasets, without the plots (I may want to put this into a munge script):

all_Ler_fits <- ddply(disperseLer, "ID", fit_dispersal_models, plot.it = FALSE) 
Warning in sqrt(diag(varcovar)): NaNs produced
Warning in sqrt(1/diag(V)): NaNs produced
Warning in cov2cor(varcovar): diag(.) had 0 or NA entries; non-finite
result is doubtful
Ler_dispersal_stats <- calc_dispersal_stats(disperseLer)
Ler_dispersal_stats <- merge(Ler_dispersal_stats, all_Ler_fits, by = "ID")
head(Ler_dispersal_stats)  
     ID Density Siliques Total_seeds Home_seeds Dispersing_seeds
1 100_0       1      126         131         73               58
2 101_0       1      177         336        194              142
3 104_0     390       NA        1384        628              756
4 105_0    1227       NA        1180        762              418
5 106_0     107       NA         389        257              132
6 107_0     199       NA         946        773              173
  Dispersal_fraction Mean_dispersal_all RMS_dispersal_all Mean_dispersal
1          0.4427481           2.580153          4.034205       5.827586
2          0.4226190           2.875000          4.663690       6.802817
3          0.5462428           4.593931          7.099011       8.410053
4          0.3542373           2.363559          4.197760       6.672249
5          0.3393316           2.611825          5.267156       7.696970
6          0.1828753           1.286469          3.335535       7.034682
  RMS_dispersal   AICnorm  AIClnorm          mu        sd    mulog
1      6.062889  210.1962  209.4645    4.098267  2.598637 1.662753
2      7.173896  602.7757  592.9470    5.101474  3.277311 1.839458
3      9.605168 3917.2841 3922.7279  -66.020580 19.643467 1.781924
4      7.052944 1749.4258 1745.8809    4.371632  3.528681 1.788268
5      9.041990  646.6945  643.1798 -207.715199 30.384323 1.307769
6      7.799881  785.4472  789.4764  -69.438135 16.731328 1.359045
      sdlog        se_mu       se_sd   se_mulog   se_sdlog
1 0.3158626  1.330679703 0.634838130 0.06392084 0.04807686
2 0.3314258  0.810572636 0.441435119 0.03321615 0.02656491
3 0.6310585          NaN         NaN 0.04842916 0.03103006
4 0.3682102  0.648281310 0.312409282 0.02473200 0.01922509
5 0.7884596  0.007112144 0.004535006 0.30807962 0.13543948
6 0.6932002 21.032531014 2.279795670 0.21971090 0.10060063

Now look at some plots:

qplot(Dispersal_fraction, mulog, data = Ler_dispersal_stats) 

qplot(Dispersal_fraction, sdlog, data = Ler_dispersal_stats) 

qplot(Dispersal_fraction, mu, data = Ler_dispersal_stats) 

qplot(Dispersal_fraction, sd, data = Ler_dispersal_stats) 

There is a clear pattern with mulog, but nothing else. This is another reason to favor the lognormal distribution: it points to a mechanistic interconnection between the two components of the dispersal process (the fraction dispersing and the distances moved). Here’s the actual relationship:

cor(Ler_dispersal_stats$mulog, Ler_dispersal_stats$Dispersal_fraction) 
[1] 0.5604027
summary(lm(mulog ~ Dispersal_fraction, data = Ler_dispersal_stats))

Call:
lm(formula = mulog ~ Dispersal_fraction, data = Ler_dispersal_stats)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.46413 -0.07753  0.00646  0.09439  0.49780 

Coefficients:
                   Estimate Std. Error t value Pr(>|t|)    
(Intercept)         1.54775    0.07454   20.76  < 2e-16 ***
Dispersal_fraction  0.66056    0.16271    4.06 0.000253 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1928 on 36 degrees of freedom
Multiple R-squared:  0.3141,    Adjusted R-squared:  0.295 
F-statistic: 16.48 on 1 and 36 DF,  p-value: 0.0002531

Implications for modeling

A simple-minded approach would be to draw a random dispersal fraction and then use the regression (including the residuals) to get a value for mulog. But really, we should be drawing from a multivariate distribution. The challenge, then, is accounting for the (presumably) beta distribution of the dispersal fraction. We can use Morris & Doak’s approach of simulating multivariate normals and then matching quantiles to get the beta distribution; since the latter is fairly symmetric (the mean looks to be around 0.4), it should recover the correlation coefficient pretty closely.
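Here’s a minimal Python sketch of that quantile-matching scheme. The marginals are placeholders, not fitted values: Beta(2, 3) has mean 0.4 and variance 0.04, roughly matching var(Dispersal_fraction) ≈ 0.038, and Normal(1.8, 0.23) is in the right ballpark for mulog; rho = 0.56 is the observed correlation.

```python
import numpy as np
from scipy.stats import norm, beta

rng = np.random.default_rng(42)

# Placeholder marginals (not fitted): mulog ~ Normal(1.8, 0.23),
# dispersal fraction ~ Beta(2, 3); target correlation from the data.
rho = 0.56

n = 20000
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)

mulog = 1.8 + 0.23 * z[:, 0]              # normal margin: just rescale
frac = beta.ppf(norm.cdf(z[:, 1]), 2, 3)  # quantile-match to the beta

print(np.corrcoef(mulog, frac)[0, 1])     # close to rho
```

Because the quantile match preserves ranks exactly, the Spearman correlation is preserved; the Pearson correlation is slightly attenuated by the nonlinear transform, but with a near-symmetric beta the loss is small.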

A bigger conceptual issue is that the points in the scatterplots above are each measured with error in both directions (binomial sampling for the dispersal fraction, MLE standard errors for mulog). This is another reason to avoid OLS regression; but does the measurement error also bias the estimate of the correlation? There may be a solution to this at https://www.rasch.org/rmt/rmt101g.htm

Third, is sdlog independent of mulog (we have already seen that it is independent of the dispersal fraction)? And what is its relationship to the major and minor axes of the mulog–dispersal-fraction distribution?

Finally, when we work with the populations, do we assume that all pots in a replicate at a particular generation experience the same dispersal kernel, or that each pot gets an independent draw from the kernel distribution? If most of the spread is driven by seeds coming from the front pot, it won’t matter much.

Disattenuated correlation coefficient

Actually, I’ve convinced myself that I don’t need to do this. We will be using the covariance rather than the correlation, and a short calculation shows that the covariance is not biased by measurement error. The disattenuation calculation corrects for the inflated variances, caused by measurement error, that are used in calculating the correlation from the covariance.
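The calculation is short: write the observed values as truth plus mean-zero measurement errors, \(x = X + e\) and \(y = Y + f\), with \(e\) and \(f\) independent of \(X\), \(Y\), and each other. Then

```latex
\begin{aligned}
\operatorname{Cov}(x, y) &= \operatorname{Cov}(X + e,\, Y + f)\\
  &= \operatorname{Cov}(X, Y) + \operatorname{Cov}(X, f) + \operatorname{Cov}(e, Y) + \operatorname{Cov}(e, f)\\
  &= \operatorname{Cov}(X, Y),
\end{aligned}
```

whereas \(\operatorname{Var}(x) = \operatorname{Var}(X) + \operatorname{Var}(e)\): it is the variances, and hence the correlation, that measurement error inflates.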

Of course, I will still need the corrected variances for populating the VC matrix used in simulation. The way to get them is to subtract the mean squared standard error from the raw variance estimates.

I’ve modified the kernel fitting routine to return the standard errors of the parameter estimates. So now for the corrections:

var(Ler_dispersal_stats$mulog)
[1] 0.05273925
var(Ler_dispersal_stats$mulog) - mean((Ler_dispersal_stats$se_mulog)^2)
[1] 0.03825842
Ler_dispersal_stats <- mutate(Ler_dispersal_stats,
                              binom_var = Dispersal_fraction * (1 - Dispersal_fraction) /
                                Total_seeds)
var(Ler_dispersal_stats$Dispersal_fraction)
[1] 0.03795841
var(Ler_dispersal_stats$Dispersal_fraction) - mean(Ler_dispersal_stats$binom_var)
[1] 0.03740287

The first correction is substantial, the second one minor.
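For the record, if we did want the disattenuated correlation itself, the standard correction (as in the Rasch note linked above) divides the observed correlation by the geometric mean of the two reliabilities, where reliability = corrected variance / raw variance. A quick Python check with the numbers above:

```python
import math

# Raw and error-corrected variances computed above
var_mulog_raw, var_mulog_corr = 0.05273925, 0.03825842
var_frac_raw, var_frac_corr = 0.03795841, 0.03740287
r_obs = 0.5604027  # observed cor(mulog, Dispersal_fraction)

# Disattenuation: divide by the square root of the product of reliabilities
reliability = (var_mulog_corr / var_mulog_raw) * (var_frac_corr / var_frac_raw)
r_true = r_obs / math.sqrt(reliability)
print(round(r_true, 3))  # about 0.66
```

So the measurement error is masking a modestly stronger underlying correlation, consistent with the first correction being the substantial one.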

Patterns of sdlog

qplot(mulog, sdlog, data = Ler_dispersal_stats) 

disp_pca <- princomp(~ mulog + Dispersal_fraction, data = Ler_dispersal_stats)
screeplot(disp_pca)

biplot(disp_pca)

qplot(disp_pca$scores[,1], Ler_dispersal_stats$sdlog)

qplot(disp_pca$scores[,2], Ler_dispersal_stats$sdlog)

summary(lm(Ler_dispersal_stats$sdlog ~ disp_pca$scores[,2]))

Call:
lm(formula = Ler_dispersal_stats$sdlog ~ disp_pca$scores[, 2])

Residuals:
     Min       1Q   Median       3Q      Max 
-0.21719 -0.07239 -0.01247  0.06634  0.26884 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)           0.45126    0.01783  25.309   <2e-16 ***
disp_pca$scores[, 2] -0.27682    0.13037  -2.123   0.0407 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1099 on 36 degrees of freedom
Multiple R-squared:  0.1113,    Adjusted R-squared:  0.08661 
F-statistic: 4.508 on 1 and 36 DF,  p-value: 0.04067
cov(subset(Ler_dispersal_stats, select = c(mulog, sdlog, Dispersal_fraction)))
                          mulog         sdlog Dispersal_fraction
mulog               0.052739247 -0.0086961783       0.0250738618
sdlog              -0.008696178  0.0132258602       0.0001367011
Dispersal_fraction  0.025073862  0.0001367011       0.0379584103
cor(subset(Ler_dispersal_stats, select = c(mulog, sdlog, Dispersal_fraction)))
                        mulog        sdlog Dispersal_fraction
mulog               1.0000000 -0.329267986        0.560402749
sdlog              -0.3292680  1.000000000        0.006101068
Dispersal_fraction  0.5604027  0.006101068        1.000000000

It looks like there is heteroskedasticity (sdlog is larger for high values of PC1, i.e., low values of mulog), and a weak negative association with PC2.

25 September 2017

Housekeeping

After repeated nagging from ProjectTemplate, I ran the script to update the project structure to v. 0.8. This changed the doc directory to docs and added some new variables to global.dcf. Unfortunately, the latter don’t seem to be documented… I’ll also need to make sure that I update the package on my computer at home.

I added a new munge script to calculate the dispersal stats and fits. To prevent it from recalculating every time, I cached the result; but now I need to add caching to all the munge scripts and, after running once, turn off caching… Done!